ANTHONY W. KNAPP


ADVANCED
ALGEBRA


To Susan 
and 
To My Children, Sarah and William, 
and 
To My Algebra Teachers: 


Ralph Fox, John Fraleigh, Robert Gunning, 
John Kemeny, Bertram Kostant, Robert Langlands, 


Goro Shimura, Hale Trotter, Richard Williamson 


Anthony W. Knapp 


Advanced Algebra 


Along with a Companion Volume Basic Algebra 


Digital Second Edition, 2016 


Published by the Author 
East Setauket, New York 


CONTENTS 


Contents of Basic Algebra 
Preface to the Second Edition 
Preface to the First Edition 
List of Figures 

Dependence among Chapters 
Guide for the Reader 
Notation and Terminology 


I. TRANSITION TO MODERN NUMBER THEORY

1. Historical Background
2. Quadratic Reciprocity
3. Equivalence and Reduction of Quadratic Forms
4. Composition of Forms, Class Group
5. Genera
6. Quadratic Number Fields and Their Units
7. Relationship of Quadratic Forms to Ideals
8. Primes in the Progressions 4n + 1 and 4n + 3
9. Dirichlet Series and Euler Products
10. Dirichlet’s Theorem on Primes in Arithmetic Progressions
11. Problems


II. WEDDERBURN-ARTIN RING THEORY

1. Historical Motivation
2. Semisimple Rings and Wedderburn’s Theorem
3. Rings with Chain Condition and Artin’s Theorem
4. Wedderburn–Artin Radical
5. Wedderburn’s Main Theorem
6. Semisimplicity and Tensor Products
7. Skolem–Noether Theorem
8. Double Centralizer Theorem
9. Wedderburn’s Theorem about Finite Division Rings
10. Frobenius’s Theorem about Division Algebras over the Reals
11. Problems




III. BRAUER GROUP

1. Definition and Examples, Relative Brauer Group
2. Factor Sets
3. Crossed Products
4. Hilbert’s Theorem 90
5. Digression on Cohomology of Groups
6. Relative Brauer Group when the Galois Group Is Cyclic
7. Problems


IV. HOMOLOGICAL ALGEBRA

1. Overview
2. Complexes and Additive Functors
3. Long Exact Sequences
4. Projectives and Injectives
5. Derived Functors
6. Long Exact Sequences of Derived Functors
7. Ext and Tor
8. Abelian Categories
9. Problems


V. THREE THEOREMS IN ALGEBRAIC NUMBER THEORY

1. Setting
2. Discriminant
3. Dedekind Discriminant Theorem
4. Cubic Number Fields as Examples
5. Dirichlet Unit Theorem
6. Finiteness of the Class Number
7. Problems


VI. REINTERPRETATION WITH ADELES AND IDELES

1. p-adic Numbers
2. Discrete Valuations
3. Absolute Values
4. Completions
5. Hensel’s Lemma
6. Ramification Indices and Residue Class Degrees
7. Special Features of Galois Extensions
8. Different and Discriminant
9. Global and Local Fields
10. Adeles and Ideles
11. Problems




VII. INFINITE FIELD EXTENSIONS


1. Nullstellensatz 

2. Transcendence Degree 

3. Separable and Purely Inseparable Extensions 
4. Krull Dimension 

5. Nonsingular and Singular Points 

6. Infinite Galois Groups 

7. Problems 


VIII. BACKGROUND FOR ALGEBRAIC GEOMETRY

1. Historical Origins and Overview
2. Resultant and Bezout’s Theorem
3. Projective Plane Curves
4. Intersection Multiplicity for a Line with a Curve
5. Intersection Multiplicity for Two Curves
6. General Form of Bezout’s Theorem for Plane Curves
7. Gröbner Bases
8. Constructive Existence
9. Uniqueness of Reduced Gröbner Bases
10. Simultaneous Systems of Polynomial Equations
11. Problems


IX. THE NUMBER THEORY OF ALGEBRAIC CURVES

1. Historical Origins and Overview
2. Divisors
3. Genus
4. Riemann–Roch Theorem
5. Applications of the Riemann–Roch Theorem
6. Problems


X. METHODS OF ALGEBRAIC GEOMETRY

1. Affine Algebraic Sets and Affine Varieties
2. Geometric Dimension
3. Projective Algebraic Sets and Projective Varieties
4. Rational Functions and Regular Functions
5. Morphisms
6. Rational Maps
7. Zariski’s Theorem about Nonsingular Points
8. Classification Questions about Irreducible Curves
9. Affine Algebraic Sets for Monomial Ideals
10. Hilbert Polynomial in the Affine Case




11. Hilbert Polynomial in the Projective Case 
12. Intersections in Projective Space 

13. Schemes 

14. Problems 


Hints for Solutions of Problems 
Selected References 

Index of Notation 

Index 


CONTENTS OF BASIC ALGEBRA 


I. Preliminaries about the Integers, Polynomials, and Matrices
II. Vector Spaces over Q, R, and C
III. Inner-Product Spaces
IV. Groups and Group Actions
V. Theory of a Single Linear Transformation
VI. Multilinear Algebra
VII. Advanced Group Theory
VIII. Commutative Rings and Their Modules
IX. Fields and Galois Theory
X. Modules over Noncommutative Rings




PREFACE TO THE SECOND EDITION 


In the years since publication of the first editions of Basic Algebra and Advanced 
Algebra, many readers have reacted to the books by sending comments, sugges- 
tions, and corrections. They appreciated the overall comprehensive nature of the 
books, associating this feature with the large number of problems that develop so 
many sidelights and applications of the theory. 

Along with the general comments and specific suggestions were corrections, 
and there were enough corrections to the first volume to warrant a second edition. 
A second edition of Advanced Algebra was then needed for compatibility. 

For the first editions, the author granted a publishing license to Birkhäuser
Boston that was limited to print media, leaving the question of electronic publi- 
cation unresolved. The main change with the second editions is that the question 
of electronic publication has now been resolved, and for each book a PDF file, 
called the “digital second edition,” is being made freely available to everyone 
worldwide for personal use. These files may be downloaded from the author’s 
own Web page and from elsewhere. 

Some adjustments to Advanced Algebra were made at the time of the revision 
of Basic Algebra. These consisted of a small number of changes to the text 
necessitated by alterations to Basic Algebra, the correction of a few misprints, 
one small amendment to the “Guide for the Reader” about Chapter VII, some 
updates to the References, and some additions to the index for completeness. No 
other changes were made. 


Ann Kostant was the person who conceived the idea, about 2003, for Birkhäuser
to have a series Cornerstones. Her vision was to enlist authors experienced at 
mathematical exposition who would write compatible texts at the early graduate 
level. The overall choice of topics was heavily influenced by the graduate curricula 
of major American universities. The idea was for each book in the series to explain 
what the young mathematician needs to know about a swath of mathematics in 
order to communicate well with colleagues in all branches of mathematics in the 
21st century. Taken together, the books in the series were intended as an antidote
for the worst effects of overspecialization. I am honored to have been part of her 
project. 

It was Benjamin Levitt, Birkhäuser mathematics editor in New York as of 2014,
who encouraged the writing of second editions of the algebra books. He made a 
number of suggestions about pursuing them, and he passed along comments from
several anonymous referees about the strengths and weaknesses of each book. I
am especially grateful to those readers who have sent me comments over the 
years. The typesetting was done by the program Textures using AMS-TeX, and
the figures were drawn with Mathematica. 

As with the first editions, I invite corrections and other comments about the 
second editions from readers. For as long as I am able, I plan to point to lists of 
known corrections from my own Web page, www.math.stonybrook.edu/~aknapp. 


A. W. KNAPP 
January 2016 


LIST OF FIGURES


3.1.  A cochain map
4.1.  Snake diagram
4.2.  Enlarged snake diagram
4.3.  Defining property of a projective
4.4.  Defining property of an injective
4.5.  Formation of derived functors
4.6.  Universal mapping property of a kernel of a morphism
4.7.  Universal mapping property of a cokernel of a morphism
4.8.  The pullback of a pair of morphisms
6.1.  Commutativity of completion and extension as field mappings
6.2.  Commutativity of completion and extension as homomorphisms
      of valued fields


DEPENDENCE AMONG CHAPTERS 


Below is a chart of the main lines of dependence of chapters on prior chapters. 
The dashed lines indicate helpful motivation but no logical dependence. Apart 
from that, particular examples may make use of information from earlier chapters 
that is not indicated by the chart. 


[Chart of dependence among chapters; the diagram did not survive text extraction.]


GUIDE FOR THE READER 


This section is intended to help the reader find out what parts of each chapter are 
most important and how the chapters are interrelated. Further information of this 
kind is contained in the abstracts that begin each of the chapters. 

The book treats its subject material as pointing toward algebraic number 
theory and algebraic geometry, with emphasis on aspects of these subjects that 
impact fields of mathematics other than algebra. Two chapters treat the theory 
of associative algebras, not necessarily commutative, and one chapter treats 
homological algebra; both these topics play a role in algebraic number theory and 
algebraic geometry, and homological algebra plays an important role in topology 
and complex analysis. The constant theme is a relationship between number 
theory and geometry, and this theme recurs throughout the book on different 
levels. 

The book assumes knowledge of most of the content of Basic Algebra, either 
from that book itself or from some comparable source. Some of the less standard 
results that are needed from Basic Algebra are summarized in the section Notation 
and Terminology beginning on page xxi. The assumed knowledge of algebra 
includes facility with using the Axiom of Choice, Zorn’s Lemma, and elementary 
properties of cardinality. All chapters of the present book but the first assume 
knowledge of Chapters I-IV of Basic Algebra other than the Sylow Theorems, 
facts from Chapter V about determinants and characteristic polynomials and 
minimal polynomials, simple properties of multilinear forms from Chapter VI, 
the definitions and elementary properties of ideals and modules from Chapter VIII, 
the Chinese Remainder Theorem and the theory of unique factorization domains 
from Chapter VIII, and the theory of algebraic field extensions and separability 
and Galois groups from Chapter IX. Additional knowledge of parts of Basic 
Algebra that is needed for particular chapters is discussed below. In addition, 
some sections of the book, as indicated below, make use of some real or complex 
analysis. The real analysis in question generally consists in the use of infinite 
series, uniform convergence, differential calculus in several variables, and some 
point-set topology. The complex analysis generally consists in the fundamentals 
of the one-variable theory of analytic functions, including the Cauchy Integral 
Formula, expansions in convergent power series, and analytic continuation. 


The remainder of this section is an overview of individual chapters and groups 
of chapters. 




Chapter I concerns three results of Gauss and Dirichlet that marked a transition 
from the classical number theory of Fermat, Euler, and Lagrange to the algebraic 
number theory of Kummer, Dedekind, Kronecker, Hermite, and Eisenstein. These 
results are Gauss’s Law of Quadratic Reciprocity, the theory of binary quadratic 
forms begun by Gauss and continued by Dirichlet, and Dirichlet’s Theorem on 
primes in arithmetic progressions. Quadratic reciprocity was a necessary prelimi- 
nary for the theory of binary quadratic forms. When viewed as giving information 
about a certain class of Diophantine equations, the theory of binary quadratic 
forms gives a gauge of what to hope for more generally. The theory anticipates 
the definition of abstract abelian groups, which occurred later historically, and 
it anticipates the definition of the class number of an algebraic number field, at 
least in the quadratic case. Dirichlet obtained formulas for the class numbers 
that arise from binary quadratic forms, and these formulas led to the method by 
which he proved his theorem on primes in arithmetic progressions. Much of the 
chapter uses only elementary results from Basic Algebra. However, Sections 6-7 
use facts about quadratic number fields, including the multiplication of ideals 
in their rings of integers, and Section 10 uses the Fourier inversion formula for 
finite abelian groups, which is in Section VII.4 of Basic Algebra. Sections 8-10
make use of a certain amount of real and complex analysis concerning uniform 
convergence and properties of analytic functions. 


Chapters II-III introduce the theory of associative algebras over fields. Chap- 
ter II includes the original theory of Wedderburn, including an amplification by 
E. Artin, while Chapter III introduces the Brauer group and connects the theory 
with the cohomology of groups. The basic material on simple and semisimple 
associative algebras is in Sections 1-3 of Chapter II, which assumes familiarity
with commutative Noetherian rings as in Chapter VIII of Basic Algebra, plus the
material in Chapter X on semisimple modules, chain conditions for modules, and
the Jordan–Hölder Theorem. Sections 4-6 contain the statement and proof of
Wedderburn’s Main Theorem, telling the structure of general finite-dimensional 
associative algebras in characteristic 0. These sections include a relatively self- 
contained segment from Proposition 2.29 through Proposition 2.33’ on the role 
of separability in the structure of tensor products of algebras. This material is the 
part of Sections 4-6 that is used in the remainder of the chapter to analyze finite-
dimensional associative division algebras over fields. Two easy consequences of 
this analysis are Wedderburn’s Theorem that every finite division ring is com- 
mutative and Frobenius’s Theorem that the only finite-dimensional associative 
division algebras over R are R, C, and the algebra H of quaternions, up to R 
isomorphism. 

Chapter III introduces the Brauer group to parametrize the isomorphism classes 
of finite-dimensional associative division algebras whose center is a given field. 
Sections 2-3 exhibit an isomorphism of a relative Brauer group with what turns
out to be a cohomology group in degree 2. This development runs parallel to
the theory of factor sets for groups as in Chapter VII of Basic Algebra, and 
some familiarity with that theory can be helpful as motivation. The case that the 
relative Brauer group is cyclic is of special importance, and the theory is used in 
the problems to construct examples of division rings that would not have been 
otherwise available. The chapter makes use of material from Chapter X of Basic 
Algebra on the tensor product of algebras and on complexes and exact sequences. 

Chapter IV is about homological algebra, with emphasis on connecting homo- 
morphisms, long exact sequences, and derived functors. All but the last section is 
done in the context of “good” categories of unital left R modules, R being a ring 
with identity, where it is possible to work with individual elements in each object. 
The reader is expected to be familiar with some example for motivation; this can 
be knowledge of cohomology of groups at the level of Section III.5, or it can be
some experience from topology or from the cohomology of Lie algebras as treated 
in other books. Knowledge of complexes and exact sequences from Chapter X 
of Basic Algebra is prerequisite. Homological algebra properly belongs in this 
book because it is fundamental in topology and complex analysis; in algebra 
its role becomes significant just beyond the level of the current book. Important 
applications are not limited in practice to “good” categories; “sheaf” cohomology 
is an example with significant applications that does not fit this mold. Section 8 
sketches the theory of homological algebra in the context of “abelian” categories. 
In this case one does not have individual elements at hand, but some substitute is 
still possible; sheaf cohomology can be treated in this context. 

Chapters V and VI are an introduction to algebraic number theory. The theory 
of Dedekind domains from Chapters VIII and IX of Basic Algebra is taken as 
known, along with knowledge of the ingredients of the theory — Noetherian rings, 
integral closure, and localization. Both chapters deal with three theorems—the 
Dedekind Discriminant Theorem, the Dirichlet Unit Theorem, and the finiteness 
of the class number. Chapter V attacks these directly, using no additional tools, 
and it comes up a little short in the case of the Dedekind Discriminant Theorem. 
Chapter VI introduces tools to get around the weakness of the development in 
Chapter V. These tools are valuations, completions, and decompositions of tensor 
products of fields with complete fields. Chapter VI makes extensive use of metric 
spaces and completeness, and compactness plays an important role in Sections 
9-10. As noted in remarks with Proposition 6.7, Section VI.2 takes for granted 
that Theorem 8.54 of Basic Algebra about extensions of Dedekind domains does 
not need separability as a hypothesis; the actual proof of the improved theorem 
without a hypothesis of separability is deferred to Section VII.3. 

Chapter VII supplies additional background needed for algebraic geometry, 
partly from field theory and partly from the theory of commutative rings.
Knowledge of Noetherian rings is needed throughout the chapter. Sections 4-5 assume
knowledge of localizations, and the indispensable Corollary 7.14 in Section 3
concerns Dedekind domains. The most important result is the Nullstellensatz 
in Section 1. Transcendence degree and Krull dimension in Sections 2 and 4 
are tied to the notion of dimension in algebraic geometry. Zariski’s Theorem 
in Section 5 is tied to the notion of singularities; part of its proof is deferred to 
Chapter X. The material on infinite Galois groups in Section 6 has applications 
to algebraic number theory and algebraic geometry but is not used in this book 
after Chapter VII; compact topological groups play a role in the theory. 

Chapters VIII-X introduce algebraic geometry from three points of view.
Chapter VIII approaches it as an attempt to understand solutions of simulta- 
neous polynomial equations in several variables using module-theoretic tools. 
Chapter IX approaches the subject of curves as an outgrowth of the complex- 
analysis theory of compact Riemann surfaces and uses number-theoretic methods. 
Chapter X approaches its subject matter geometrically, using the field-theoretic 
and ring-theoretic tools developed in Chapter VII. All three chapters assume 
knowledge of Section VII.1 on the Nullstellensatz. 

Chapter VIII is in three parts. Sections 1-4 are relatively elementary and 
concern the resultant and preliminary forms of Bezout’s Theorem. Sections 
5-6 concern intersection multiplicity for curves and make extensive use of lo- 
calizations; the goal is a better form of Bezout’s Theorem. Sections 7-10 
are independent of Sections 5-6 and introduce the theory of Gröbner bases.
This subject was developed comparatively recently and lies behind many of the 
symbolic manipulations of polynomials that are possible with computers. 

Chapter IX concerns irreducible curves and is in two parts. Sections 1-3 define
divisors and the genus of such a curve, while Sections 4-5 prove the Riemann–Roch
Theorem and give applications of it. The tool for the development is discrete
valuations as in Section VI.2, and the parallel between the theory in Chapter VI 
for algebraic number fields and the theory in Chapter IX for curves becomes more 
evident than ever. Some complex analysis is needed to understand the motivation 
in Sections 1 and 4.

Chapter X largely concerns algebraic sets defined as zero loci over an alge- 
braically closed field. The irreducible such sets are called varieties. Sections 1-3 
are concerned with algebraic sets and their dimension, Sections 4-6 treat maps
between varieties, and Sections 7-8 deal with finer questions. Sections 9-12
are independent of Sections 6-8 and do two things simultaneously: they tie the
theoretical work on dimension to the theory of Gröbner bases in Chapter VIII,
making dimension computable, and they show how the dimension of a zero locus 
is affected by adding one equation to the defining system. The chapter concludes 
with an introductory section about schemes, in which the underlying algebraically 
closed field is replaced by a commutative ring with identity. The entire chapter 
assumes knowledge of elementary point-set topology. 


NOTATION AND TERMINOLOGY 


This section contains some items of notation and terminology from Basic Algebra
that are not necessarily reviewed when they occur in the present book. A few
results are mentioned as well. The items are grouped by topic.


Set theory

$\in$                                   membership symbol
$\#S$ or $|S|$                          number of elements in $S$
$\varnothing$                           empty set
$\{x \in E \mid P\}$                    the set of $x$ in $E$ such that $P$ holds
$E^c$                                   complement of the set $E$
$E \cup F$, $E \cap F$, $E - F$         union, intersection, difference of sets
$\bigcup_\alpha E_\alpha$, $\bigcap_\alpha E_\alpha$    union, intersection of the sets $E_\alpha$
$E \subseteq F$, $E \supseteq F$        containment
$E \subsetneq F$, $E \supsetneq F$      proper containment
$(a_1, \dots, a_n)$                     ordered $n$-tuple
$\{a_1, \dots, a_n\}$                   unordered $n$-tuple
$f : E \to F$, $x \mapsto f(x)$         function, effect of function
$f \circ g$ or $fg$, $f|_E$             composition of $f$ following $g$, restriction to $E$
$f(\,\cdot\,, y)$                       the function $x \mapsto f(x, y)$
$f(E)$, $f^{-1}(E)$                     direct and inverse image of a set
in one-one correspondence               matched by a one-one onto function
countable                               finite or in one-one correspondence with integers
$2^A$                                   set of all subsets of $A$

Number systems

$\delta_{ij}$                           Kronecker delta: 1 if $i = j$, 0 if $i \neq j$
$\binom{n}{k}$                          binomial coefficient
$n$ positive, $n$ negative              $n > 0$, $n < 0$
$\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$    integers, rationals, reals, complex numbers
max, min                                maximum/minimum of finite subset of reals
$[x]$                                   greatest integer $\le x$ if $x$ is real
$\mathrm{Re}\, z$, $\mathrm{Im}\, z$    real and imaginary parts of complex $z$
$\bar{z}$                               complex conjugate of $z$
$|z|$                                   absolute value of $z$



Linear algebra and elementary number theory

$F^n$                                   space of $n$-dimensional column vectors
$e_j$                                   $j$th standard basis vector of $F^n$
$V'$                                    dual vector space of vector space $V$
$\dim_F V$ or $\dim V$                  dimension of vector space $V$ over field $F$
$0$                                     zero vector, matrix, or linear mapping
$1$ or $I$                              identity matrix or linear mapping
$A^t$                                   transpose of $A$
$\det A$                                determinant of $A$
$[M_{ij}]$                              matrix with $(i, j)$th entry $M_{ij}$
$\binom{L}{\Delta\Gamma}$               matrix of $L$ relative to domain ordered basis $\Gamma$
                                        and range ordered basis $\Delta$
$x \cdot y$                             dot product
$\cong$                                 is isomorphic to, is equivalent to
$\mathbb{F}_p$                          integers modulo a prime $p$, as a field
GCD                                     greatest common divisor
$\equiv$                                is congruent to
$\varphi$                               Euler’s $\varphi$ function


Groups, rings, modules, and categories

$0$                                     additive identity in an abelian group
$1$                                     multiplicative identity in a group or ring
$\cong$                                 is isomorphic to, is equivalent to
$C_m$                                   cyclic group of order $m$
unit                                    invertible element in ring $R$ with identity
$R^\times$                              group of units in ring $R$ with identity
$R^n$                                   space of column vectors with entries in ring $R$
$R^o$                                   opposite ring to $R$ with $a \circ b = ba$
$M_{mn}(R)$                             $m$-by-$n$ matrices with entries in $R$
$M_n(R)$                                $n$-by-$n$ matrices with entries in $R$
unital left $R$ module                  left $R$ module $M$ with $1m = m$ for all $m \in M$
$\mathrm{Hom}_R(M, N)$                  group of $R$ homomorphisms from $M$ into $N$
$\mathrm{End}_R(M)$                     ring of $R$ homomorphisms from $M$ into $M$
$\ker \varphi$, $\mathrm{image}\, \varphi$    kernel and image of $\varphi$
$H^n(G, N)$                             $n$th cohomology of group $G$ with coefficients
                                        in abelian group $N$
simple left $R$ module                  nonzero unital left $R$ module with no proper
                                        nonzero $R$ submodules
semisimple left $R$ module              sum (= direct sum) of simple left $R$ modules
$\mathrm{Obj}(\mathcal{C})$             class of objects for category $\mathcal{C}$
$\mathrm{Morph}_{\mathcal{C}}(A, B)$    set of morphisms from object $A$ to object $B$
$1_A$                                   identity morphism on $A$
$\mathcal{C}^S$                         category of $S$-tuples of objects from $\mathrm{Obj}(\mathcal{C})$
product of $\{X_s\}_{s \in S}$          $(X, \{p_s\}_{s \in S})$ such that if $A$ in $\mathrm{Obj}(\mathcal{C})$ and
                                        $\{\varphi_s \in \mathrm{Morph}_{\mathcal{C}}(A, X_s)\}$ are given, then there
                                        exists a unique $\varphi \in \mathrm{Morph}_{\mathcal{C}}(A, X)$ with
                                        $p_s \varphi = \varphi_s$ for all $s$
coproduct of $\{X_s\}_{s \in S}$        $(X, \{i_s\}_{s \in S})$ such that if $A$ in $\mathrm{Obj}(\mathcal{C})$ and
                                        $\{\varphi_s \in \mathrm{Morph}_{\mathcal{C}}(X_s, A)\}$ are given, then there
                                        exists a unique $\varphi \in \mathrm{Morph}_{\mathcal{C}}(X, A)$ with
                                        $\varphi i_s = \varphi_s$ for all $s$
$\mathcal{C}^{\mathrm{opp}}$            category opposite to $\mathcal{C}$


Commutative rings R with identity and factorization of elements

identity                                denoted by 1, allowed to equal 0
ideal $I = (r_1, \dots, r_n)$           ideal generated by $r_1, \dots, r_n$
prime ideal $I$                         proper ideal with $ab \in I$ implying $a \in I$ or $b \in I$
integral domain                         $R$ with no zero divisors and with $1 \neq 0$
$R/I$ with $I$ prime                    always an integral domain
$\mathrm{GL}(n, R)$                     group of invertible $n$-by-$n$ matrices, entries in $R$
Chinese Remainder Theorem               $I_1, \dots, I_n$ given ideals with $I_i + I_j = R$ for $i \neq j$.
                                        Then the natural map $\varphi : R \to \prod_{j=1}^{n} R/I_j$ yields
                                        isomorphism $R/(\bigcap_{j=1}^{n} I_j) \cong R/I_1 \times \cdots \times R/I_n$
                                        of rings. Also $\bigcap_{j=1}^{n} I_j = I_1 \cdots I_n$.
Nakayama’s Lemma                        If $I$ is an ideal contained in all maximal ideals
                                        and $M$ is a finitely generated unital $R$ module
                                        with $IM = M$, then $M = 0$.
algebra $A$ over $R$                    unital $R$ module with an $R$ bilinear multiplication
                                        $A \times A \to A$. In this book nonassociative
                                        algebras appear only in Chapter II, and each
                                        associative algebra has an identity.
$RG$                                    group algebra over $R$ for group $G$
$R[X_1, \dots, X_n]$                    polynomial algebra over $R$ with $n$ indeterminates
$R[x_1, \dots, x_n]$                    $R$ algebra generated by $x_1, \dots, x_n$
irreducible element $r \neq 0$          $r \notin R^\times$ such that $r = ab$ implies $a \in R^\times$ or $b \in R^\times$
prime element $r \neq 0$                $r \notin R^\times$ such that whenever $r$ divides $ab$, then
                                        $r$ divides $a$ or $r$ divides $b$
irreducible vs. prime                   prime implies irreducible; in any unique
                                        factorization domain, irreducible implies prime
GCD                                     greatest common divisor in unique factorization
                                        domain




Fields

$\mathbb{F}_q$                          a finite field with $q = p^n$ elements, $p$ prime
$K/F$                                   an extension field $K$ of a field $F$
$[K : F]$                               degree of extension $K/F$, i.e., $\dim_F K$
$K(X_1, \dots, X_n)$                    field of fractions of $K[X_1, \dots, X_n]$
$K(x_1, \dots, x_n)$                    field generated by $K$ and $x_1, \dots, x_n$
number field                            finite-dimensional field extension of $\mathbb{Q}$
$\mathrm{Gal}(K/F)$                     Galois group, automorphisms of $K$ fixing $F$
$N_{K/F}(\,\cdot\,)$ and $\mathrm{Tr}_{K/F}(\,\cdot\,)$    norm and trace functions from $K$ to $F$


Tools for algebraic number theory and algebraic geometry

Noetherian $R$                          commutative ring with identity whose ideals
                                        satisfy the ascending chain condition; has the
                                        property that any $R$ submodule of a finitely
                                        generated unital $R$ module is finitely generated.
Hilbert Basis Theorem                   $R$ nonzero Noetherian implies $R[X]$ Noetherian

Integral closure

Situation: $R$ = integral domain, $F$ = field of fractions, $K/F$ = extension field.

$x \in K$ integral over $R$             $x$ is a root of a monic polynomial in $R[X]$
integral closure of $R$ in $K$          set of $x \in K$ integral over $R$, is a ring
$R$ integrally closed                   $R$ equals its integral closure in $F$

Localization

Situation: $R$ = commutative ring with identity, $S$ = multiplicative system in $R$.

$S^{-1}R$                               localization, pairs $(r, s)$ with $r \in R$ and $s \in S$,
                                        modulo $(r, s) \sim (r', s')$ if $t(rs' - sr') = 0$
                                        for some $t \in S$
property of $S^{-1}R$                   $I \mapsto S^{-1}I$ is one-one from set of ideals $I$ in $R$
                                        of form $I = R \cap J$ onto set of ideals in $S^{-1}R$
local ring                              commutative ring with identity having a unique
                                        maximal ideal
$R_P$ for prime ideal $P$               localization with $S$ = complement of $P$ in $R$
Dedekind domain                         Noetherian integrally closed integral domain in
                                        which every nonzero prime ideal is maximal, has
                                        unique factorization of nonzero ideals as product
                                        of prime ideals
Dedekind domain extension               $R$ Dedekind, $F$ field of fractions, $K/F$ finite
                                        separable extension, $T$ integral closure of $R$ in $K$.
                                        Then $T$ is Dedekind, and any nonzero prime ideal
                                        $\wp$ in $R$ has $\wp T = \prod_{i=1}^{g} P_i^{e_i}$ for distinct prime
                                        ideals $P_i$ with $P_i \cap R = \wp$. These have $\sum_{i=1}^{g} e_i f_i$
                                        $= [K : F]$, where $f_i = [T/P_i : R/\wp]$.


CHAPTER I 


Transition to Modern Number Theory 


Abstract. This chapter establishes Gauss’s Law of Quadratic Reciprocity, the theory of binary 
quadratic forms, and Dirichlet’s Theorem on primes in arithmetic progressions. 

Section 1 outlines how the three topics of the chapter occurred in natural sequence and marked
a transition as the subject of number theory developed a coherence and moved toward the kind of 
algebraic number theory that is studied today. 

Section 2 establishes quadratic reciprocity, which is a reduction formula providing a rapid method
for deciding solvability of congruences $x^2 \equiv m \bmod p$ for the unknown $x$ when $p$ is prime.
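
As a small illustration of what "deciding solvability" means, the following sketch tests whether $x^2 \equiv m \bmod p$ has a solution for an odd prime $p$. It does not use the reciprocity law itself; it uses Euler's criterion, a different standard test that is swapped in here because it fits in a few lines. The function names are hypothetical and chosen only for this example.

```python
def is_square_mod_p(m: int, p: int) -> bool:
    """Euler's criterion (assumes p is an odd prime):
    a nonzero residue m is a square mod p iff m^((p-1)/2) is 1 mod p."""
    m %= p
    if m == 0:
        return True  # x = 0 solves x^2 = 0 mod p
    return pow(m, (p - 1) // 2, p) == 1


def is_square_mod_p_brute(m: int, p: int) -> bool:
    """Direct search over all residues x, used only as a cross-check."""
    return any((x * x - m) % p == 0 for x in range(p))


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        squares = [m for m in range(p) if is_square_mod_p(m, p)]
        print(p, squares)
```

The brute-force search takes about $p$ steps, whereas the modular exponentiation takes on the order of $\log p$ multiplications, which is the sense in which a "rapid method" matters.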

Sections 3-5 develop the theory of binary quadratic forms $ax^2 + bxy + cy^2$, where $a, b, c$ are
integers. The basic tool is that of proper equivalence of two such forms, which occurs when the two 
forms are related by an invertible linear substitution with integer coefficients and determinant 1. The 
theorems establish the finiteness of the number of proper equivalence classes for given discriminant, 
conditions for the representability of primes by forms of a given discriminant, canonical representa- 
tives of the finitely many proper equivalence classes of a given discriminant, a group law for proper 
equivalence classes of forms of the same discriminant that respects representability of integers by 
the classes, and a theory of genera that takes into account inequivalent forms whose values cannot 
be distinguished by linear congruences. 
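
The notion of proper equivalence can be made concrete with a short sketch. It assumes the usual definition of the discriminant of $ax^2 + bxy + cy^2$ as $b^2 - 4ac$, which is not stated in this abstract, and it represents a form by the triple `(a, b, c)`. The function names are hypothetical. The substitution is $x = \alpha x' + \beta y'$, $y = \gamma x' + \delta y'$ with integer entries and determinant 1.

```python
def discriminant(form):
    """Discriminant b^2 - 4ac of the form a x^2 + b x y + c y^2 (assumed definition)."""
    a, b, c = form
    return b * b - 4 * a * c


def substitute(form, matrix):
    """Coefficients of the form after x = alpha x' + beta y', y = gamma x' + delta y'."""
    a, b, c = form
    (alpha, beta), (gamma, delta) = matrix
    a2 = a * alpha**2 + b * alpha * gamma + c * gamma**2
    b2 = 2 * a * alpha * beta + b * (alpha * delta + beta * gamma) + 2 * c * gamma * delta
    c2 = a * beta**2 + b * beta * delta + c * delta**2
    return (a2, b2, c2)


def determinant(matrix):
    (alpha, beta), (gamma, delta) = matrix
    return alpha * delta - beta * gamma


if __name__ == "__main__":
    form = (2, 1, 3)
    matrix = ((2, 1), (1, 1))  # integer entries, determinant 1
    new_form = substitute(form, matrix)
    print(new_form, discriminant(form), discriminant(new_form))
```

In this example the two forms are properly equivalent and share a discriminant, which is why it makes sense to speak of the proper equivalence classes "of a given discriminant."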

Sections 6-7 digress to leap forward historically and interpret the group law for proper equivalence
classes of binary quadratic forms in terms of an equivalence relation on the nonzero ideals in the 
ring of integers of an associated quadratic number field. 

Sections 8-10 concern Dirichlet’s Theorem on primes in arithmetic progressions. Section 8 
discusses Euler’s product formula for $\sum_{n=1}^{\infty} n^{-s}$ and shows how Euler was able to modify it to
prove that there are infinitely many primes 4k + 1 and infinitely many primes 4k + 3. Section 9 
develops Dirichlet series as a tool to be used in the generalization, and Section 10 contains the proof 
of Dirichlet’s Theorem. Section 8 uses some elementary real analysis, and Sections 9-10 use both 
elementary real analysis and elementary complex analysis. 
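
A numerical sketch may help fix ideas about the product formula. It assumes the standard statement $\sum_{n=1}^{\infty} n^{-s} = \prod_{p} (1 - p^{-s})^{-1}$ for $s > 1$, with the product over all primes $p$, and it compares truncations of both sides at $s = 2$. The helper names and the cutoff are illustrative choices, not taken from the text.

```python
def primes_up_to(n: int) -> list:
    """Sieve of Eratosthenes returning all primes <= n (n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def partial_sum(s: float, n_max: int) -> float:
    """Sum of n^(-s) for 1 <= n <= n_max."""
    return sum(n ** -s for n in range(1, n_max + 1))


def partial_product(s: float, p_max: int) -> float:
    """Product of 1 / (1 - p^(-s)) over primes p <= p_max."""
    product = 1.0
    for p in primes_up_to(p_max):
        product *= 1.0 / (1.0 - p ** -s)
    return product


if __name__ == "__main__":
    print(partial_sum(2, 10_000), partial_product(2, 10_000))
```

The truncated product is at least as large as the truncated sum with the same cutoff, because expanding the product yields $n^{-s}$ for every $n$ whose prime factors are all at most the cutoff, and that set includes every $n$ up to the cutoff.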


1. Historical Background 


The period 1800 to 1840 saw great advances in number theory as the subject 
developed a coherence and moved toward the kind of algebraic number theory that 
is studied today. The groundwork had been laid chiefly by Euclid, Diophantus, 
Fermat, Euler, Lagrange, and Legendre. Some of what those people did was 
remarkably insightful for its time, but what collectively had come out of their 
labors was more a collection of miscellaneous results than an organized theory. 
It was Gauss who first gave direction and depth to the subject, beginning with
his book Disquisitiones Arithmeticae in 1801. Dirichlet built on Gauss’s work,
clarifying the deeper parts and adding analytic techniques that pointed toward 
the integrated subject of the future. This chapter concentrates on three jewels of 
classical number theory — largely the work of Gauss and Dirichlet—that seem on 
the surface to be only peripherally related but are actually a natural succession 
of developments leading from earlier results toward modern algebraic number 
theory. To understand the context, it is necessary to back up for a moment. 

Diophantine equations in two or more variables have always lain at the heart of 
number theory. Fundamental examples that have played an important role in the 
development of the subject are $ax^2 + bxy + cy^2 = m$ for unknown integers $x$ and $y$; 
$x_1^2 + x_2^2 + x_3^2 + x_4^2 = m$ for unknown integers $x_1, x_2, x_3, x_4$; $y^2 = x(x-1)(x+1)$ 
for unknown integers $x$ and $y$; and $x^n + y^n = z^n$ for unknown integers $x, y, z$. 

In every case one can get an immediate necessary condition on a solution by 
writing the equation modulo some integer $n$. The necessary condition is that 
the corresponding congruence modulo $n$ have a solution. For example take the 
equation $x^2 + y^2 = p$, where $p$ is a prime, and let us allow ourselves to use 
the more elementary results of Basic Algebra. Writing the equation modulo 
$p$ leads to $x^2 + y^2 \equiv 0 \bmod p$. Certainly $x$ cannot be divisible by $p$, since 
otherwise $y$ would be divisible by $p$, $x^2$ and $y^2$ would be divisible by $p^2$, and 
$x^2 + y^2 = p$ would be divisible by $p^2$, contradiction. Thus we can divide, 
obtaining $1 + (yx^{-1})^2 \equiv 0 \bmod p$. Hence $z^2 \equiv -1 \bmod p$ for $z = xy^{-1}$. If $p$ 
is an odd prime, then $-1$ has order 2, and the necessary condition is that there 
exist some $z$ in $\mathbb{F}_p^\times$ whose order is exactly 4. Since $\mathbb{F}_p^\times$ is cyclic of order $p - 1$, 
the necessary condition is that 4 divide $p - 1$. 

Using a slightly more complicated argument, we can establish conversely that 
the divisibility of $p - 1$ by 4 implies that $x^2 + y^2 = p$ is solvable for integers 
$x$ and $y$. In fact, we know from the solvability of $z^2 \equiv -1 \bmod p$ that there 
exists an integer $r$ such that $p$ divides $r^2 + 1$. Consider the possibilities in the 
integral domain $\mathbb{Z}[i]$ of Gaussian integers, where $i = \sqrt{-1}$. It was shown in 
Chapter VIII of Basic Algebra that $\mathbb{Z}[i]$ is Euclidean. Hence $\mathbb{Z}[i]$ is a principal 
ideal domain, and its elements have unique factorization. If $p$ remains prime in 
$\mathbb{Z}[i]$, then the fact that $p$ divides $(r + i)(r - i)$ implies that $p$ divides $r + i$ or 
$r - i$ in $\mathbb{Z}[i]$. Then at least one of $\frac{r}{p} + i\frac{1}{p}$ and $\frac{r}{p} - i\frac{1}{p}$ would have to be in $\mathbb{Z}[i]$. 
Since $\frac{1}{p}$ is not an integer, neither of these elements is in $\mathbb{Z}[i]$; this divisibility 
therefore does not hold, and we conclude that $p$ 
does not remain prime in $\mathbb{Z}[i]$. If we write $p = (a + bi)(c + di)$ nontrivially, 
then $p^2 = |a + bi|^2 |c + di|^2 = (a^2 + b^2)(c^2 + d^2)$ as an equality in $\mathbb{Z}$, and we 
readily conclude that $a^2 + b^2 = p$. 
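
The argument above can be turned into a small computation. The following is only a sketch under stated assumptions, not the method of the text: it finds $r$ with $r^2 \equiv -1 \bmod p$ by trial, and then runs the Euclidean algorithm in $\mathbb{Z}[i]$ on $p$ and $r + i$, which produces a nontrivial factor $a + bi$ of $p$. The function names (`sqrt_of_minus_one`, `gaussian_remainder`, `two_squares`) are hypothetical, and Gaussian integers are modeled as pairs of Python integers.

```python
def sqrt_of_minus_one(p):
    """Return r with r*r = -1 (mod p); assumes p is a prime with p % 4 == 1."""
    for a in range(2, p):
        r = pow(a, (p - 1) // 4, p)
        if r * r % p == p - 1:
            return r
    raise ValueError("no square root of -1 found; is p a prime with p % 4 == 1?")

def nearest_quotient(x, n):
    """Nearest integer to x / n for an integer n > 0."""
    return (2 * x + n) // (2 * n)

def gaussian_remainder(a, b):
    """Remainder of a on division by b in Z[i]; elements are (real, imag) pairs."""
    ar, ai = a
    br, bi = b
    norm = br * br + bi * bi
    # a / b = a * conj(b) / norm; round each coordinate to the nearest integer.
    qr = nearest_quotient(ar * br + ai * bi, norm)
    qi = nearest_quotient(ai * br - ar * bi, norm)
    # remainder = a - q * b, whose norm is at most norm / 2
    return (ar - (qr * br - qi * bi), ai - (qr * bi + qi * br))

def two_squares(p):
    """Return (a, b) with a*a + b*b == p, for a prime p with p % 4 == 1."""
    a, b = (p, 0), (sqrt_of_minus_one(p), 1)
    while b != (0, 0):
        a, b = b, gaussian_remainder(a, b)
    return abs(a[0]), abs(a[1])

for p in (5, 13, 17, 29, 37, 41):
    print(p, two_squares(p))
```

The greatest common divisor of $p$ and $r + i$ has norm $p$ because $p$ divides $(r + i)(r - i)$ but neither factor, which is exactly the step used in the paragraph above.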

This much argument solves the Diophantine equation $x^2 + y^2 = p$ for $p$ prime. 
For $p$ replaced by a general integer $m$, we use the identity 

$$(x_1^2 + y_1^2)(x_2^2 + y_2^2) = (x_1 x_2 - y_1 y_2)^2 + (x_1 y_2 + x_2 y_1)^2,$$

which has been known since antiquity, and we see that $x^2 + y^2 = m$ is solvable 
if $m$ is a product of odd primes of the form $4k + 1$. It is solvable also if $m = 2$ 
and if $m = p^2$ for any prime $p$. Thus $x^2 + y^2 = m$ is solvable whenever $m$ is a 
positive integer such that each prime of the form $4k + 3$ dividing $m$ divides $m$ an 
even number of times. Using congruences modulo prime powers, we see that this 
condition is also necessary, and we arrive at the following result; historically it 
had already been asserted as a theorem by Fermat and was subsequently proved 
by Euler, albeit by more classical methods than we have used. 


Proposition 1.1. The Diophantine equation $x^2 + y^2 = m$ is solvable in integers 
$x$ and $y$ for a given positive integer $m$ if and only if every prime number $p = 4k + 3$ 
dividing $m$ occurs an even number of times in the prime factorization of $m$. 
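
As a sanity check on Proposition 1.1, one can compare a brute-force search for $x$ and $y$ with the factorization criterion. The sketch below is illustrative only; the helper names are hypothetical, and trial division is used purely for simplicity.

```python
from math import isqrt

def is_sum_of_two_squares(m):
    """Brute force: is m = x*x + y*y for some integers x, y?"""
    for x in range(isqrt(m) + 1):
        y2 = m - x * x
        y = isqrt(y2)
        if y * y == y2:
            return True
    return False

def satisfies_criterion(m):
    """Every prime p = 4k + 3 dividing m occurs to an even power."""
    p = 2
    while p * p <= m:
        exponent = 0
        while m % p == 0:
            m //= p
            exponent += 1
        if p % 4 == 3 and exponent % 2 == 1:
            return False
        p += 1
    # What is left is 1 or a single prime occurring to the first power.
    return m % 4 != 3

mismatches = [m for m in range(1, 2000)
              if is_sum_of_two_squares(m) != satisfies_criterion(m)]
print(mismatches)
```

An empty list of mismatches is what the proposition predicts for this range.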


The first step in the above argument used congruence information; we had 
to know the primes $p$ for which $z^2 \equiv -1 \bmod p$ is solvable. The second step 
was in two parts—both rather special. First we used specific information about 
the nature of factorization in a particular ring of algebraic integers, namely Z[i]. 
Second we used that the norm of a product is the product of the norms in that 
same ring of algebraic integers. 

It is too much to hope that some recognizable generalization of these steps with 
$x^2 + y^2 = m$ can handle all or most Diophantine equations. At least the first step 
is available in complete generality, and indeed number theory — both classical and 
modern—deduces many helpful conclusions by passing to congruences. There 
is the matter of deducing something useful from a given congruence, but doing 
so is a finite problem for each prime. Like some others before him, Gauss set 
about studying congruences systematically. Linear congruences are easy and had 
been handled before. Quadratic congruences are logically the next step. The 
first jewel of classical number theory to be discussed in this chapter is the Law 
of Quadratic Reciprocity of Gauss, which appears below as Theorem 1.2 and 
which makes useful deductions possible in the case of quadratic congruences. In 
effect quadratic reciprocity allows one to decide easily which integers are squares 
modulo a prime p. Euler had earlier come close to finding the statement of this 
result, and Legendre had found the exact statement without finding a complete 
proof. Gauss was the one who gave the first complete proof. 


Part of the utility of quadratic reciprocity is that it helps one to attack quadratic 
Diophantine equations more systematically. The second jewel of classical number 
theory to be discussed in this chapter is the body of results concerning representing 
integers by binary quadratic forms $ax^2 + bxy + cy^2 = m$ that do not degenerate in 
some way. Lagrange and Legendre had already made advances in this theory, but 
Gauss’s own discoveries were decisive. Dirichlet simplified the more advanced 
parts of the theory and investigated an aspect of it that Gauss had not addressed 
and that would lead Dirichlet to his celebrated theorem on primes in arithmetic 
progressions.¹ 

Lagrange had introduced the notion of the discriminant of a quadratic form 
and a notion of equivalence of such forms, two forms of the same discriminant 
being equivalent if one can be obtained from the other by a linear invertible 
substitution with integer entries. Equivalence is important because equivalent 
forms represent the same numbers. He established also a theory of reduced forms 
that specifies representatives of each equivalence class. For an odd prime $p$, 
$ax^2 + bxy + cy^2 = p$ is solvable only if the discriminant $b^2 - 4ac$ is a square 
modulo $p$, and Lagrange was hampered by not knowing quadratic reciprocity. 
But he did know some special cases, such as when 5 is a square modulo $p$, and he 
was able to deal completely with discriminant $-20$. For this discriminant, there 
are two equivalence classes, represented by $x^2 + 5y^2$ and $2x^2 + 2xy + 3y^2$, and 
Lagrange showed for primes $p$ other than 2 and 5 that 


$x^2 + 5y^2 = p$ is solvable if and only if $p \equiv 1$ or $9 \bmod 20$, 
$2x^2 + 2xy + 3y^2 = p$ is solvable if and only if $p \equiv 3$ or $7 \bmod 20$; 


the fact about $x^2 + 5y^2 = p$ had been conjectured earlier by Euler. Lagrange 
observed further that 

$$(2x_1^2 + 2x_1 y_1 + 3y_1^2)(2x_2^2 + 2x_2 y_2 + 3y_2^2) = (2x_1 x_2 + x_1 y_2 + y_1 x_2 + 3y_1 y_2)^2 + 5(x_1 y_2 - y_1 x_2)^2,$$

from which it follows that the product of two primes congruent to 3 or 7 modulo 
20 is representable as $x^2 + 5y^2$; this fact had been conjectured by Fermat. 
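
Both of Lagrange's statements for discriminant $-20$ are easy to test numerically. The following sketch checks the product identity on a grid of small integers and compares, for primes below 200, which of the two forms represents each prime. The function names and the search bound are assumptions made for illustration.

```python
from itertools import product

def f(x, y):
    """The form 2x^2 + 2xy + 3y^2."""
    return 2 * x * x + 2 * x * y + 3 * y * y

def g(x, y):
    """The form x^2 + 5y^2."""
    return x * x + 5 * y * y

def identity_holds(x1, y1, x2, y2):
    """Lagrange's identity: f(x1, y1) * f(x2, y2) = g(X, Y)."""
    X = 2 * x1 * x2 + x1 * y2 + y1 * x2 + 3 * y1 * y2
    Y = x1 * y2 - y1 * x2
    return f(x1, y1) * f(x2, y2) == g(X, Y)

def represented(form, bound):
    """All values form(x, y) with |x|, |y| <= bound."""
    r = range(-bound, bound + 1)
    return {form(x, y) for x, y in product(r, r)}

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

grid = range(-4, 5)
identity_ok = all(identity_holds(*v) for v in product(grid, repeat=4))

by_g = represented(g, 20)   # bound 20 covers every value below 200
by_f = represented(f, 20)
primes = [p for p in range(3, 200) if is_prime(p) and p != 5]
table_ok = all((p in by_g) == (p % 20 in (1, 9)) and
               (p in by_f) == (p % 20 in (3, 7)) for p in primes)
print(identity_ok, table_ok)
```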

Legendre added to this investigation the correct formula for quadratic 
reciprocity, which he incorrectly believed he had proved, and many of its 
consequences for representability of primes by binary quadratic forms. In addition, 
he tried to develop a theory of composition of forms that generalizes Lagrange’s 
identity above, but he had only limited success. 

In addition to establishing quadratic reciprocity, Gauss introduced the vital 
notion of “proper equivalence” for forms $ax^2 + bxy + cy^2$ of the same discriminant, 
two forms of the same discriminant being properly equivalent if one can be 
obtained from the other by a linear invertible substitution with integer entries 
and determinant $+1$. In terms of this definition, he settled the representability 
of primes by binary quadratic forms, he showed that there are only finitely many 
proper equivalence classes for each discriminant, and he gave an algorithm for 
deciding whether two forms are properly equivalent. The main results of Gauss in 
this direction appear as Theorems 1.6 and 1.8 below. In addition, Gauss showed, 
without the benefit of having a definition of “group,” in effect that the set of 
proper equivalence classes of forms with a given discriminant becomes a finite 
abelian group in a way that controls representability of nonprime integers; by 
contrast, Lagrange’s definition of equivalence does not lead to a group structure. 
Gauss’s main results in this direction, as recast by Dirichlet, appear as Theorem 
1.12 below. 

¹These matters are affirmed in Dirichlet’s Lectures on Number Theory. The aspect that Gauss 
had not addressed and that provided motivation for Dirichlet is the value of the “Dirichlet class 
number” $h(D)$ defined below. 

The story does not stop here, but let us pause for a moment to say what 
Lagrange’s theory, as amended by Gauss, says for the above example, first rephrasing 
the context in more modern terminology. We saw earlier that unique factorization 
in the ring $\mathbb{Z}[i]$ of Gaussian integers is the key to the representation of integers 
by the quadratic form $x^2 + y^2$. For a general quadratic form $ax^2 + bxy + cy^2$ 
with discriminant $D = b^2 - 4ac$, properties of the ring $R$ of algebraic integers in 
the field $\mathbb{Q}(\sqrt{D})$ are relevant for the questions that Gauss investigated. It turns 
out that $R$ is a principal ideal domain if Gauss’s finite abelian group of proper 
equivalence classes is trivial and that when $D$ is “fundamental,” there is a suitable 
converse.² 

With the context rephrased we come back to the example. Consider the 
equation $x^2 + 5y^2 = p$ for primes $p$. The discriminant of $x^2 + 5y^2$ is $-20$, and the 
relevant ring of algebraic integers is $\mathbb{Z}[\sqrt{-5}]$, which is not a unique factorization 
domain. Thus the argument used with $x^2 + y^2 = p$ does not apply, and we 
have no reason to expect that solvability of $x^2 + 5y^2 \equiv 0 \bmod p$ is sufficient for 
solvability of $x^2 + 5y^2 = p$. Let us look more closely. The congruence condition 
is that $-20$ is a square modulo $p$. Thus $-5$ is to be a square modulo $p$. If we 
leave aside the primes $p = 2$ and $p = 5$ that divide 20, the Law of Quadratic 
Reciprocity will tell us that the necessary congruence resulting from solvability 
of $x^2 + 5y^2 = p$ is that $p$ be congruent to 1, 3, 7, or 9 modulo 20. However, we 
can compute all residues $n$ of $x^2 + 5y^2$ modulo 20 for $n$ with $\mathrm{GCD}(n, 20) = 1$ to 
see that 


$x^2 + 5y^2 \equiv 1$ or $9 \bmod 20$ if $\mathrm{GCD}(x^2 + 5y^2, 20) = 1$. 


Meanwhile, the form $2x^2 + 2xy + 3y^2$ has discriminant $-20$, and we can check 
that solvability of $2x^2 + 2xy + 3y^2 = p$ leads to the conclusion that 


$2x^2 + 2xy + 3y^2 \equiv 3$ or $7 \bmod 20$ if $\mathrm{GCD}(2x^2 + 2xy + 3y^2, 20) = 1$. 


Lagrange’s theory easily shows that representability of integers by a form depends 
only on the equivalence class of the form and that all primes congruent to 1, 3, 
7, or 9 modulo 20 are representable by some form. This example is special 
in that equivalence and proper equivalence come to the same thing. Gauss’s 
multiplication rule for proper equivalence classes of forms with discriminant 
$-20$ produces a group of order 2, with $x^2 + 5y^2$ representing the identity class 
and $2x^2 + 2xy + 3y^2$ representing the other class. Consequently 


$p \equiv 1$ or $9 \bmod 20$ implies $x^2 + 5y^2 = p$ solvable, 
$p \equiv 3$ or $7 \bmod 20$ implies $2x^2 + 2xy + 3y^2 = p$ solvable. 


In addition, the multiplication rule has the property that if $m$ is representable by 
all forms in the class of $a_1 x^2 + b_1 xy + c_1 y^2$ and $n$ is representable by all forms 
in the class of $a_2 x^2 + b_2 xy + c_2 y^2$, then $mn$ is representable by all forms in the 
class of the product form. It is not necessary to have an explicit identity for the 
multiplication. Thus, for example, it follows without further argument that if $p$ 
and $q$ are primes congruent to 3 or 7 modulo 20, then $x^2 + 5y^2 = pq$ is solvable. 

²In each of the situations (a) and (b) of Proposition 1.17 below, $R$ is a principal ideal domain 
only if Gauss’s group is trivial. In all other cases, Gauss’s group is nontrivial, and $R$ is a principal 
ideal domain only if the group has order 2. 

Let us elaborate a little about the rephrased context for Gauss’s theory. We let 
D be the discriminant of the binary quadratic forms in question, and we assume 
that D is “fundamental.” Let R be the ring of algebraic integers that lie in the 
field $\mathbb{Q}(\sqrt{D})$. It turns out to be possible to define a notion of “strict equivalence” 
on the set of ideals of R in such a way that multiplication of ideals descends to a 
multiplication of strict equivalence classes. The strict equivalence classes of ideals 
then form a group, and this group is isomorphic to Gauss’s group. In particular, 
one obtains the nonobvious conclusion that the set of strict equivalence classes 
of ideals is finite. The main result giving this isomorphism is Theorem 1.20. 
This rephrasing of the theory points to a generalization to algebraic number fields 
of degree higher than 2 and is a starting point for modern algebraic number theory. 

Now we return to the work of Gauss. Even the example with $D = -20$ that was 
described above does not give an idea of how complicated matters can become. 
For discriminant $-56$, for example, the two forms $x^2 + 14y^2$ and $2x^2 + 7y^2$ take on 
the same residues modulo 56 that are prime to 56, but no prime can be represented 
by both forms. These two forms and the forms $3x^2 \pm 2xy + 5y^2$ represent the 
four proper equivalence classes. By contrast, there are only three equivalence 
classes in Lagrange’s sense, and we thus get some insight into why Legendre 
encountered difficulties in defining a useful multiplication even for $D = -56$. 
Gauss’s theory goes on to address the problem that $x^2 + 14y^2$ and $2x^2 + 7y^2$ take 
on one set of residues modulo 56 and prime to 56 while $3x^2 \pm 2xy + 5y^2$ take 
on a disjoint set of such residues. Gauss defined a “genus” (plural: “genera”) 
to consist of proper equivalence classes like these that cannot be distinguished 
by linear congruences, and he obtained some results about this notion. Gauss’s 
set of genera inherits a group structure from the group structure on the proper 
equivalence classes of forms, and the group structure for the genera enables one 
to work with genera easily. 
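
The residue statements for discriminant $-56$ can be confirmed by direct enumeration. The sketch below, with a hypothetical helper `unit_residues`, lists the residues modulo 56 that are prime to 56 and are taken on by each of the four forms; it runs over all pairs $(x, y)$ modulo 56, which suffices because the value modulo 56 depends only on $x$ and $y$ modulo 56.

```python
from math import gcd

def unit_residues(a, b, c, m):
    """Residues mod m, prime to m, of a*x^2 + b*x*y + c*y^2."""
    values = set()
    for x in range(m):
        for y in range(m):
            v = a * x * x + b * x * y + c * y * y
            if gcd(v, m) == 1:
                values.add(v % m)
    return values

genus_one = unit_residues(1, 0, 14, 56)
genus_one_again = unit_residues(2, 0, 7, 56)
genus_two_plus = unit_residues(3, 2, 5, 56)
genus_two_minus = unit_residues(3, -2, 5, 56)

print(sorted(genus_one), sorted(genus_two_plus))
```

The first two sets coincide, the last two coincide, and the two resulting sets are disjoint, which is the phenomenon that the notion of genus is designed to capture.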




The third jewel of classical number theory to be discussed in this chapter is 
Dirichlet’s celebrated theorem on primes in arithmetic progressions, given below 
as Theorem 1.21. The statement is that if m and b are positive relatively prime 
integers, then there are infinitely many primes of the form km +b with k a positive 
integer. The proof mixes algebra, a little real analysis, and some complex analysis. 

What is not immediately apparent is how this theorem fits into a natural 
historical sequence with Gauss’s theory of binary quadratic forms. In fact, the 
statement about primes in arithmetic progressions was thrust upon Dirichlet in at 
least two ways. Dirichlet thoroughly studied the work of those who came before 
him. One aspect of that work was Legendre’s progress toward obtaining quadratic 
reciprocity; in fact, Legendre actually had a proof of quadratic reciprocity except 
that he assumed the unproved result about primes in arithmetic progressions for 
part of it and argued in circular fashion for another part of it. Another aspect 
of the work Dirichlet studied was Gauss’s theory of multiplication of proper 
equivalence classes of forms, which Dirichlet saw a need to simplify and explain; 
indeed, a complete answer to the representability of composite numbers requires 
establishing theorems about genera beyond what Gauss obtained and has to make 
use of the theorem about primes in arithmetic progressions. 

In addition, Dirichlet asked and settled a question about proper equivalence 
classes for which Gauss had published nothing and for which Jacobi had 
conjectured an answer: How many such classes are there for each discriminant $D$? Let 
us call this number the “Dirichlet class number,” denoting it by $h(D)$. Dirichlet’s 
answer has several cases to it. When $D$ is fundamental, even, negative, and not 
equal to $-4$, the answer is 

$$h(D) = \frac{2\sqrt{|D/4|}}{\pi} \sum_{\substack{n \ge 1, \\ \mathrm{GCD}(n,D)=1}} \left(\frac{D/4}{n}\right) \frac{1}{n},$$

with the sum taken over positive integers prime to $D$. Here when $p$ is a prime 
not dividing $D$, $\left(\frac{D/4}{p}\right)$ is $+1$ if $D/4$ is a square modulo $p$ and is $-1$ if not. For 
general $n = \prod p^k$ prime to $D$, $\left(\frac{D/4}{n}\right)$ is the product of the expressions $\left(\frac{D/4}{p}\right)^k$ 
corresponding to the factorization³ of $n$. When $D = -4$, the quantity on the right 
side has to be doubled to give the correct result, and thus the formula becomes 

$$h(-4) = \frac{4}{\pi} \sum_{n \text{ odd} \ge 1} \left(\frac{-1}{n}\right) \frac{1}{n} = \frac{4}{\pi} \sum_{n \text{ odd} \ge 1} \frac{(-1)^{(n-1)/2}}{n}.$$

The adjusted formula correctly gives $h(-4) = 1$, since Leibniz had shown 
more than a century earlier that $1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \frac{\pi}{4}$. Dirichlet was able to 
evaluate the displayed infinite series for general $D$ as a finite sum, but that further 
step does not concern us here. The important thing to observe is that the infinite 
series is always an instance of a series $\sum_{n=1}^{\infty} \chi(n)/n$ with $\chi$ a periodic function 
on the positive integers satisfying $\chi(mn) = \chi(m)\chi(n)$. Dirichlet’s derivation 
of a series expansion for his class numbers required care because the series is only 
conditionally convergent. To be able to work with absolutely convergent series, 
he initially replaced $\frac{1}{n}$ by $\frac{1}{n^s}$ for $s > 1$, thus initially treating series he denoted by 
$L(s, \chi) = \sum_{n=1}^{\infty} \chi(n)/n^s$. 

³The expression $\left(\frac{D/4}{n}\right)$ is called a “Jacobi symbol.” See Problems 9-11 at the end of the chapter. 

As a consequence of this work, Dirichlet was familiar with series $L(s, \chi)$ and 
was aware of the importance of expressions $L(1, \chi)$, knowing that at least when 
$\chi(n) = \left(\frac{D/4}{n}\right)$, $L(1, \chi)$ is not 0 because it is essentially a class number. This 
nonvanishing turns out to be the core of the proof of the theorem on primes in 
arithmetic progressions. Dirichlet would have known about Euler’s proof that 
the progressions $4n + 1$ and $4n + 3$ contain infinitely many primes, a proof 
that we give in Section 8, and he would have recognized Euler’s expression 
$\sum_{n=0}^{\infty} (-1)^n/(2n + 1)$ as something that occurs in his formula for $h(-4)$. Thus 
he was well equipped with tools and motivation for a proof of his theorem on 
primes in arithmetic progressions. 
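
Dirichlet's class number series for even fundamental $D < 0$ can be explored numerically. The sketch below is an illustration under stated assumptions rather than Dirichlet's own evaluation: it truncates the conditionally convergent series after a fixed number of terms, it computes the Jacobi symbol by the standard reciprocity-based algorithm, and the names `jacobi` and `dirichlet_h` are hypothetical. The doubling for $D = -4$ follows the adjustment described above.

```python
from math import gcd, pi, sqrt

def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd positive integer n."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:          # pull out factors of 2 using (2/n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                # reciprocity step
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def dirichlet_h(D, terms=100000):
    """Truncated series for h(D); assumes D is even, negative, fundamental."""
    d = D // 4                     # exact, since 4 divides D
    total = sum(jacobi(d, n) / n
                for n in range(1, terms + 1) if gcd(n, D) == 1)
    h = 2 * sqrt(abs(d)) / pi * total
    return 2 * h if D == -4 else h

for D in (-4, -20, -56):
    print(D, round(dirichlet_h(D), 3))
```

The truncated values come out close to 1, 2, and 4, matching the class counts quoted in this section for discriminants $-4$, $-20$, and $-56$.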


2. Quadratic Reciprocity 


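If $p$ is an odd prime number and $a$ is an integer with $a \not\equiv 0 \bmod p$, the Legendre 
symbol $\left(\frac{a}{p}\right)$ is defined by 

$$\left(\frac{a}{p}\right) = \begin{cases} +1 & \text{if } a \text{ is a square modulo } p, \\ -1 & \text{if } a \text{ is not a square modulo } p. \end{cases}$$

Since $\mathbb{F}_p^\times$ is a cyclic group of even order, the squares form a subgroup of index 2. 
Therefore $a \mapsto \left(\frac{a}{p}\right)$ is a group homomorphism of $\mathbb{F}_p^\times$ into $\{\pm 1\}$, and we have 

$\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right)$ whenever $a$ and $b$ are not divisible by $p$. 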


Theorem 1.2 (Law of Quadratic Reciprocity). If $p$ and $q$ are distinct odd 
prime numbers, then 

(a) $\left(\frac{-1}{p}\right) = (-1)^{\frac{1}{2}(p-1)}$, 

(b) $\left(\frac{2}{p}\right) = (-1)^{\frac{1}{8}(p^2-1)}$, 

(c) $\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{[\frac{1}{2}(p-1)][\frac{1}{2}(q-1)]}$. 


REMARKS. Conclusion (a) is due to Fermat and says that $-1$ is a square modulo 
$p$ if and only if $p = 4n + 1$. We proved this result already in Section 1 and will 
not re-prove it here. Conclusion (b) is due to Euler and says that 2 is a square 
modulo $p$ if and only if $p = 8n \pm 1$. Conclusion (c) is due to Gauss and says 
that if $p$ or $q$ is $4n + 1$, then $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ and otherwise $\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right)$. The proofs of 
(b) and (c) will occupy the remainder of this section. 
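
The three conclusions of Theorem 1.2 can be checked for small primes directly from the definition of the Legendre symbol. The following sketch is a brute-force verification, not a proof; the helper names are hypothetical, and the symbol is computed by listing the squares modulo $p$.

```python
def odd_primes(bound):
    """Odd primes below bound, by trial division."""
    return [n for n in range(3, bound)
            if all(n % d for d in range(2, int(n ** 0.5) + 1))]

def legendre(a, p):
    """Legendre symbol from the definition; assumes p is an odd prime not dividing a."""
    squares = {x * x % p for x in range(1, p)}
    return 1 if a % p in squares else -1

def reciprocity_holds(bound):
    """Check (a), (b), (c) of Theorem 1.2 for all odd primes below bound."""
    ps = odd_primes(bound)
    for p in ps:
        if legendre(-1, p) != (-1) ** ((p - 1) // 2):
            return False
        if legendre(2, p) != (-1) ** ((p * p - 1) // 8):
            return False
        for q in ps:
            if q != p:
                sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
                if legendre(p, q) * legendre(q, p) != sign:
                    return False
    return True

print(reciprocity_holds(100))
```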


EXAMPLES. 


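(1) This example illustrates how quickly iterated use of the theorem decides 
whether a given integer is a square. We compute $\left(\frac{17}{79}\right)$. We have 

$$\left(\tfrac{17}{79}\right) = \left(\tfrac{79}{17}\right) = \left(\tfrac{11}{17}\right) = \left(\tfrac{17}{11}\right) = \left(\tfrac{6}{11}\right) = \left(\tfrac{2}{11}\right)\left(\tfrac{3}{11}\right) = -\left(\tfrac{3}{11}\right) = +\left(\tfrac{11}{3}\right) = +\left(\tfrac{2}{3}\right) = -1,$$

the successive equalities being justified by using (c), the formula $\left(\frac{a+kp}{p}\right) = \left(\frac{a}{p}\right)$, 
(c) again, $\left(\frac{a+kp}{p}\right) = \left(\frac{a}{p}\right)$ again, the formula $\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right)$ and (b), (c) once more, 
$\left(\frac{a+kp}{p}\right) = \left(\frac{a}{p}\right)$ once more, and an explicit evaluation of $\left(\frac{2}{3}\right)$. 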


(2) Lemma 9.46 of Basic Algebra asserts that 3 is a generator of the cyclic 
group $\mathbb{F}_n^\times$ when $n$ is prime of the form $2^{2^N} + 1$ with $N > 0$, and Theorem 1.2 
enables us to give a proof. In fact, this $n$ has $n \equiv 2 \bmod 3$ and $n \equiv 1 \bmod 4$. 
Thus $\left(\frac{3}{n}\right) = \left(\frac{n}{3}\right) = \left(\frac{2}{3}\right) = -1$. Since $\mathbb{F}_n^\times$ is a cyclic group whose order is a power 
of 2, every nonsquare is a generator. Thus 3 is a generator. 
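
For the primes $2^{2^N} + 1$ with $1 \le N \le 4$, namely 5, 17, 257, and 65537, the claim that 3 is a generator can be confirmed by computing the multiplicative order of 3 directly. This is a small illustrative sketch; `multiplicative_order` is a hypothetical helper that multiplies repeatedly until it reaches 1.

```python
def multiplicative_order(g, n):
    """Order of g in the multiplicative group mod n; assumes gcd(g, n) == 1."""
    k, x = 1, g % n
    while x != 1:
        x = x * g % n
        k += 1
    return k

fermat_primes = [2 ** (2 ** N) + 1 for N in range(1, 5)]   # 5, 17, 257, 65537
orders = {n: multiplicative_order(3, n) for n in fermat_primes}
print(orders)
```

In each case the order equals $n - 1$, so 3 generates the whole group.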


We prove two lemmas, give the proof of (b), prove a third lemma, and then 
give the proof of (c). 


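Lemma 1.3. If $p$ is an odd prime and $a$ is any integer such that $p$ does not 
divide $a$, then $a^{\frac{1}{2}(p-1)} \equiv \left(\frac{a}{p}\right) \bmod p$. 

PROOF. The multiplicative group $\mathbb{F}_p^\times$ being cyclic, let $b$ be a generator. Write 
$a \equiv b^r \bmod p$ for some integer $r$. Since $\left(\frac{a}{p}\right) = (-1)^r$ and $a^{\frac{1}{2}(p-1)} \equiv (b^r)^{\frac{1}{2}(p-1)} = 
(b^{\frac{1}{2}(p-1)})^r \equiv (-1)^r \bmod p$, the lemma follows. 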


Lemma 1.4 (Gauss). Let $p$ be an odd prime, and let $a$ be any integer such that 
$p$ does not divide $a$. Among the least positive residues modulo $p$ of the integers 
$a, 2a, 3a, \dots, \frac{1}{2}(p-1)a$, let $n$ denote the number of residues that exceed $p/2$. 
Then $\left(\frac{a}{p}\right) = (-1)^n$. 

PROOF. Let $r_1, \dots, r_n$ be the least positive residues exceeding $p/2$, and let 
$s_1, \dots, s_k$ be those less than $p/2$, so that $n + k = \frac{1}{2}(p-1)$. The residues 
$r_1, \dots, r_n, s_1, \dots, s_k$ are distinct, since no two of $a, 2a, 3a, \dots, \frac{1}{2}(p-1)a$ differ 
by a multiple of $p$. Each integer $p - r_i$ is strictly between 0 and $p/2$, and we 
cannot have any equality $p - r_i = s_j$, since $r_i + s_j = p$ would mean that $(u + v)a$ 
is divisible by $p$ for some integers $u$ and $v$ with $1 \le u, v \le \frac{1}{2}(p-1)$. Hence 

$$p - r_1, \dots, p - r_n, s_1, \dots, s_k$$

is a permutation of $1, \dots, \frac{1}{2}(p-1)$. Modulo $p$, we therefore have 

$$\begin{aligned} 1 \cdot 2 \cdots \tfrac{1}{2}(p-1) &\equiv (-1)^n r_1 \cdots r_n s_1 \cdots s_k \\ &\equiv (-1)^n a \cdot 2a \cdots \tfrac{1}{2}(p-1)a \\ &= (-1)^n a^{\frac{1}{2}(p-1)} \, 1 \cdot 2 \cdots \tfrac{1}{2}(p-1), \end{aligned}$$

and cancellation yields $a^{\frac{1}{2}(p-1)} \equiv (-1)^n \bmod p$. The result follows by combining 
this congruence with the conclusion of Lemma 1.3. 
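
Gauss's count is easy to compute, and comparing it with the power $a^{\frac{1}{2}(p-1)}$ of Lemma 1.3 gives a quick numerical check of Lemma 1.4. The sketch below uses hypothetical helper names and a short hard-coded list of primes.

```python
def gauss_count(a, p):
    """Number n of least positive residues of a, 2a, ..., ((p-1)/2)a exceeding p/2."""
    half = (p - 1) // 2
    # For odd p, an integer residue exceeds p/2 exactly when it exceeds (p-1)/2.
    return sum(1 for u in range(1, half + 1) if (u * a) % p > half)

def euler_sign(a, p):
    """The sign of a^((p-1)/2) mod p, as in Lemma 1.3; assumes p does not divide a."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

small_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
lemma_ok = all((-1) ** gauss_count(a, p) == euler_sign(a, p)
               for p in small_primes for a in range(1, p))
print(lemma_ok, gauss_count(2, 17), gauss_count(2, 19))
```

For $a = 2$ the counts at $p = 17$ and $p = 19$ are 4 and 5, in line with the values $2k$ and $2k + 1$ for $p = 8k + 1$ and $p = 8k + 3$.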


PROOF OF (b) IN THEOREM 1.2. We shall apply Lemma 1.4 with $a = 2$ after 
investigating the least positive residues of $2, 4, 6, \dots, p - 1$. We can list explicitly 
those residues that exceed $p/2$ for each odd value of $p \bmod 8$ as follows: 


$p = 8k + 1$:   $4k + 2, 4k + 4, \dots, 8k$, 
$p = 8k + 3$:   $4k + 2, 4k + 4, \dots, 8k + 2$, 
$p = 8k + 5$:   $4k + 4, \dots, 8k + 2, 8k + 4$, 
$p = 8k + 7$:   $4k + 4, \dots, 8k + 4, 8k + 6$. 


If $n$ denotes the number of such residues for a given $p$, a count of each line of the 
above table shows that 


$n = 2k$ and $(-1)^n = +1$ for $p = 8k + 1$, 
$n = 2k + 1$ and $(-1)^n = -1$ for $p = 8k + 3$, 
$n = 2k + 1$ and $(-1)^n = -1$ for $p = 8k + 5$, 
$n = 2k + 2$ and $(-1)^n = +1$ for $p = 8k + 7$. 


Thus Lemma 1.4 shows that $\left(\frac{2}{p}\right) = +1$ for $p = 8k \pm 1$ and $\left(\frac{2}{p}\right) = -1$ for 
$p = 8k \pm 3$. This completes the proof of (b). 


Lemma 1.5. If $p$ is an odd prime and $a$ is a positive odd integer such that $p$ 
does not divide $a$, then $\left(\frac{a}{p}\right) = (-1)^t$, where $t = \sum_{u=1}^{\frac{1}{2}(p-1)} [ua/p]$. Here $[\,\cdot\,]$ denotes 
the greatest-integer function. 


REMARKS. When $a = 2$, the equality $\left(\frac{a}{p}\right) = (-1)^t$ fails for $p = 3$, since 
$t = [2/3] = 0$. 


PROOF. With notation as in Lemma 1.4 and its proof, we form each $ua$ for $1 \le 
u \le \frac{1}{2}(p-1)$ and reduce modulo $p$, obtaining as least positive residue either some 
$r_i$ for $i \le n$ or some $s_j$ for $j \le k$. Then $ua/p = [ua/p] + p^{-1}(\text{some } r_i \text{ or } s_j)$. 
Hence 

$$\sum_{u=1}^{\frac{1}{2}(p-1)} ua = \sum_{u=1}^{\frac{1}{2}(p-1)} p[ua/p] + \sum_{i=1}^{n} r_i + \sum_{j=1}^{k} s_j. \qquad (*)$$

The proof of Lemma 1.4 showed that $p - r_1, \dots, p - r_n, s_1, \dots, s_k$ is a permutation 
of $1, \dots, \frac{1}{2}(p-1)$, and thus the sum is the same in the two cases: 

$$\sum_{u=1}^{\frac{1}{2}(p-1)} u = \sum_{i=1}^{n} (p - r_i) + \sum_{j=1}^{k} s_j = np - \sum_{i=1}^{n} r_i + \sum_{j=1}^{k} s_j.$$

Subtracting this equation from $(*)$, we obtain 

$$(a - 1) \sum_{u=1}^{\frac{1}{2}(p-1)} u = p\Big( \sum_{u=1}^{\frac{1}{2}(p-1)} [ua/p] - n \Big) + 2 \sum_{i=1}^{n} r_i.$$

Replacing $\sum_{u=1}^{\frac{1}{2}(p-1)} u$ on the left side by its value $\frac{1}{8}(p^2 - 1)$ and taking into account 
that $p$ is odd, we obtain the following congruence modulo 2: 

$$(a - 1)\tfrac{1}{8}(p^2 - 1) \equiv \sum_{u=1}^{\frac{1}{2}(p-1)} [ua/p] - n \bmod 2.$$

Since $a$ is odd, the left side is congruent to 0 modulo 2. Therefore $n \equiv 
\sum_{u=1}^{\frac{1}{2}(p-1)} [ua/p] = t \bmod 2$, and Lemma 1.4 allows us to conclude that $(-1)^t = 
(-1)^n = \left(\frac{a}{p}\right)$. 

PROOF OF (c) IN THEOREM 1.2. Let 

$$S = \{ (x, y) \in \mathbb{Z} \times \mathbb{Z} \mid 1 \le x \le \tfrac{1}{2}(p-1) \text{ and } 1 \le y \le \tfrac{1}{2}(q-1) \},$$

the number of elements in question being $|S| = \frac{1}{4}(p-1)(q-1)$. We can write 
$S = S_1 \cup S_2$ disjointly with 

$$S_1 = \{ (x, y) \mid qx > py \} \quad \text{and} \quad S_2 = \{ (x, y) \mid qx < py \};$$

the exhaustion of $S$ by $S_1$ and $S_2$ follows because $qx = py$ would imply that 
$p$ divides $qx$ and hence that $p$ divides $x$, contradiction. We can describe $S_1$ 
alternatively as 

$$S_1 = \{ (x, y) \mid 1 \le x \le \tfrac{1}{2}(p-1) \text{ and } 1 \le y < qx/p \},$$

and therefore $|S_1| = \sum_{x=1}^{\frac{1}{2}(p-1)} [qx/p]$, which is the integer $t$ in Lemma 1.5 such 
that $(-1)^t = \left(\frac{q}{p}\right)$. Similarly we have $|S_2| = \sum_{y=1}^{\frac{1}{2}(q-1)} [py/q]$, which is the integer 
$t$ in Lemma 1.5 such that $(-1)^t = \left(\frac{p}{q}\right)$. Therefore 

$$(-1)^{\frac{1}{4}(p-1)(q-1)} = (-1)^{|S|} = (-1)^{|S_1|}(-1)^{|S_2|} = \left(\frac{q}{p}\right)\left(\frac{p}{q}\right),$$

and the proof is complete. 
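
The lattice-point count in this proof can be reproduced by machine. The sketch below counts the points of $S_1$ and $S_2$ directly and compares them with the floor sums of Lemma 1.5; the function names and the list of test primes are illustrative assumptions.

```python
def floor_sum(a, p):
    """The integer t = sum of [u*a/p] for u = 1, ..., (p-1)/2 from Lemma 1.5."""
    return sum((u * a) // p for u in range(1, (p - 1) // 2 + 1))

def count_S1_S2(p, q):
    """Sizes of S1 (qx > py) and S2 (qx < py) inside the rectangle S."""
    s1 = s2 = 0
    for x in range(1, (p - 1) // 2 + 1):
        for y in range(1, (q - 1) // 2 + 1):
            if q * x > p * y:
                s1 += 1
            else:                  # qx = py cannot occur for distinct primes
                s2 += 1
    return s1, s2

test_primes = [3, 5, 7, 11, 13, 17, 19, 23]
counts_ok = True
for p in test_primes:
    for q in test_primes:
        if p != q:
            s1, s2 = count_S1_S2(p, q)
            counts_ok &= (s1 == floor_sum(q, p) and s2 == floor_sum(p, q)
                          and s1 + s2 == (p - 1) * (q - 1) // 4)
print(counts_ok)
```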


3. Equivalence and Reduction of Quadratic Forms 


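A binary quadratic form over $\mathbb{Z}$ is a function $F(x, y) = ax^2 + bxy + cy^2$ from 
$\mathbb{Z} \times \mathbb{Z}$ to $\mathbb{Z}$ with $a, b, c$ in $\mathbb{Z}$. Following Gauss,⁴ we abbreviate this $F$ as $(a, b, c)$. 
We shall always assume, without explicitly saying so, that the discriminant 
$D = b^2 - 4ac$ is not the square of an integer and that $F$ is primitive in the sense 
that $\mathrm{GCD}(a, b, c) = 1$. When there is no possible ambiguity, we may say “form” 
or “quadratic form” in place of “binary quadratic form.” 

Let $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ be a member of the group $\mathrm{GL}(2, \mathbb{Z})$ of integer matrices whose 
inverse is an integer matrix. The determinant of such a matrix is $\pm 1$. We can use 
this matrix to change variables, writing 

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \alpha x' + \beta y' \\ \gamma x' + \delta y' \end{pmatrix}.$$

Then $ax^2 + bxy + cy^2$ becomes 

$$a(\alpha x' + \beta y')^2 + b(\alpha x' + \beta y')(\gamma x' + \delta y') + c(\gamma x' + \delta y')^2$$
$$= (a\alpha^2 + b\alpha\gamma + c\gamma^2)x'^2 + (2a\alpha\beta + b\alpha\delta + b\beta\gamma + 2c\gamma\delta)x'y' + (a\beta^2 + b\beta\delta + c\delta^2)y'^2.$$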


⁴Disquisitiones Arithmeticae, Article 153. Actually, Gauss always assumed that the coefficient 
of $xy$ is even and consequently wrote $(a, b, c)$ for $ax^2 + 2bxy + cy^2$. To study $x^2 + xy + y^2$, for 
example, he took $a = 2$, $b = 1$, $c = 2$. The convention of working with $ax^2 + bxy + cy^2$ is due to 
Eisenstein. 


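If we associate the triple $(a, b, c)$ of $F(x, y)$ to the matrix $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$, then this 
formula shows that the triple $(a', b', c')$ of the new form $F'(x', y')$ is associated 
to the matrix 

$$\begin{pmatrix} 2a' & b' \\ b' & 2c' \end{pmatrix} = \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}.$$

From this equality of matrices, we see that 

(i) the member $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ of $\mathrm{GL}(2, \mathbb{Z})$ has the effect of the identity 
transformation, 

(ii) the member $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} \alpha' & \beta' \\ \gamma' & \delta' \end{pmatrix}$ of $\mathrm{GL}(2, \mathbb{Z})$ has the effect of applying first 
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ and then $\begin{pmatrix} \alpha' & \beta' \\ \gamma' & \delta' \end{pmatrix}$. 

These two facts say that we do not quite have the expected group action on forms 
on the left. Instead, we can say either that we have a group action on the right 
or that $gF$ is obtained from $F$ by operating by $g^t$. Anyway, there are orbits, 
and they are what we really need. The discriminant $D = b^2 - 4ac$ of the form 
$F$ is evidently minus the determinant of the associated matrix $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$, and the 
displayed equality of matrices thus implies that the discriminant of the form $F'$ 
is $D(\alpha\delta - \beta\gamma)^2$. Since $(\alpha\delta - \beta\gamma)^2 = 1$ for matrices in $\mathrm{GL}(2, \mathbb{Z})$, we conclude 
that 

(iii) each member of $\mathrm{GL}(2, \mathbb{Z})$ preserves the discriminant of the form. 

Hence the group $\mathrm{GL}(2, \mathbb{Z})$ acts on the forms of discriminant $D$. 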
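
The change of variables and properties (ii) and (iii) can be illustrated with a few lines of code. In this sketch a form is a triple `(a, b, c)` and a matrix is a pair of rows `((alpha, beta), (gamma, delta))`; the function names are hypothetical, and the coefficient formulas are the ones obtained by expanding $F(\alpha x' + \beta y', \gamma x' + \delta y')$.

```python
def transform(form, g):
    """Coefficients of F(alpha x' + beta y', gamma x' + delta y')."""
    a, b, c = form
    (al, be), (ga, de) = g
    return (a * al * al + b * al * ga + c * ga * ga,
            2 * a * al * be + b * al * de + b * be * ga + 2 * c * ga * de,
            a * be * be + b * be * de + c * de * de)

def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c

def evaluate(form, x, y):
    a, b, c = form
    return a * x * x + b * x * y + c * y * y

def matmul(g, h):
    """Product of two 2-by-2 integer matrices given as pairs of rows."""
    (a, b), (c, d) = g
    (e, f), (m, n) = h
    return ((a * e + b * m, a * f + b * n), (c * e + d * m, c * f + d * n))

F = (2, 2, 3)                       # discriminant -20
g = ((1, 1), (0, 1))                # determinant +1
h = ((0, -1), (1, 0))               # determinant +1
print(transform(F, g), discriminant(transform(F, g)))
print(transform(transform(F, g), h) == transform(F, matmul(g, h)))
print(transform(F, ((1, 0), (0, -1))))   # determinant -1: (a, b, c) -> (a, -b, c)
```

The second line of output reflects property (ii): the product $gh$ acts by applying first $g$ and then $h$.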

Forms in the same orbit under $\mathrm{GL}(2, \mathbb{Z})$ are said to be equivalent. Forms in 
the same orbit under the subgroup $\mathrm{SL}(2, \mathbb{Z})$ are said to be properly equivalent. 
A proper equivalence class of forms will refer to the latter relation. This notion 
is due to Gauss. Equivalence under $\mathrm{GL}(2, \mathbb{Z})$ is an earlier notion due to Lagrange, 
and we shall refer to its classes as ordinary equivalence classes on the infrequent 
occasions when the notion arises. Proper equivalence is necessary later in order 
to get a group operation on classes of forms. If one form can be carried to another 
form by a member of $\mathrm{GL}(2, \mathbb{Z})$ of determinant $-1$, we say that the two forms are 
improperly equivalent. Use of the matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ shows that the form $(a, b, c)$ is 
improperly equivalent to the form $(a, -b, c)$. In particular, $(a, 0, c)$ is improperly 
equivalent to itself. 

The discriminant $D$ is congruent to $b^2$ modulo 4 and hence is congruent to 0 
or 1 modulo 4. All nonsquare integers $D$ that are congruent to 0 or 1 modulo 4 
arise as discriminants; in fact, we can always achieve such a $D$ with $a = 1$ and 
with $b$ equal either to 0 or to 1. 

The discriminant is minus the determinant of the matrix $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$ associated to 
$(a, b, c)$, and this matrix is real symmetric with trace $2(a + c)$. Since $D = b^2 - 4ac$ 
is assumed not to be the square of an integer, neither $a$ nor $c$ can be 0. 


If $D > 0$, the symmetric matrix $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$ is indefinite, having eigenvalues 
of opposite sign. In this case the Dirichlet class number of $D$, denoted by 
$h(D)$, is defined to be the number⁵ of all proper equivalence classes of forms of 
discriminant $D$. 

If $D < 0$, then $a$ and $c$ have the same sign. The matrix $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$ is positive 
definite if $a$ and $c$ are positive, and it is negative definite if $a$ and $c$ are negative. 
Correspondingly we refer to the form $(a, b, c)$ as positive definite or negative 
definite in the two cases. Since $g^t \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} g$ is positive definite whenever $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$ 
is positive definite, any form equivalent to a positive definite form is again positive 
definite. A similar remark applies to negative definite forms. Thus “positive 
definite” and “negative definite” are class properties. For any given discriminant 
$D < 0$, the Dirichlet class number of $D$, denoted by $h(D)$, is the number⁶ of 
proper equivalence classes of positive definite forms of discriminant $D$. 

The form $(a, b, c)$ represents an integer $m$ if $ax^2 + bxy + cy^2 = m$ is solvable 
for some integers $x$ and $y$. The form primitively represents $m$ if the $x$ and 
$y$ with $ax^2 + bxy + cy^2 = m$ can be chosen to be relatively prime. In any 
event, $\mathrm{GCD}(x, y)^2$ divides $m$, and thus whenever a form represents a prime $p$, it 
primitively represents $p$. 


Theorem 1.6. Fix a nonsquare discriminant $D$. 


(a) The Dirichlet class number $h(D)$ is finite. In fact, any form of discriminant 
$D$ is properly equivalent to a form $(a, b, c)$ with $|b| \le |a| \le |c|$ and therefore 
has $3|ac| \le |D|$, and the number of forms of discriminant $D$ satisfying all these 
inequalities is finite. 

(b) An odd prime $p$ with $\mathrm{GCD}(D, p) = 1$ is primitively representable by some 
form $(a, b, c)$ of discriminant $D$ if and only if $\left(\frac{D}{p}\right) = +1$. In this case the number 
of proper equivalence classes of forms primitively representing $p$ is either 1 or 2, 
and these classes are carried to one another by $\mathrm{GL}(2, \mathbb{Z})$. In fact, if $\left(\frac{D}{p}\right) = +1$, 
then $b^2 \equiv D \bmod 4p$ for some integer $b$, and representatives of these classes may 
be taken to be $\left(p, \pm b, \frac{b^2 - D}{4p}\right)$. 


⁵This number was studied by Dirichlet. According to Theorem 1.20 below, it counts the “strict 
equivalence classes” of ideals in a sense that is introduced in Section 7. This number either equals or 
is twice the number of equivalence classes of ideals in the other sense that is introduced in Section 7. 
The latter is what is generalized in Chapter V in the subject of algebraic number theory, and the latter 
is how “class number” is usually defined in modern books in algebraic number theory. Consequently 
Dirichlet class numbers sometimes are twice what modern class numbers are. We use “Dirichlet 
class numbers” in this chapter and change to the modern “class numbers” in Chapter V. 

⁶This number was studied by Dirichlet. See the previous footnote for further information. 


We come to the proof after some preliminary remarks and examples. The 
argument for (a) is constructive, and thus the forms given explicitly in (b) can 
be transformed constructively into properly equivalent forms satisfying the 
conditions of (a). Hence we are led to explicit forms as in (a) representing $p$. A 
generalization of (b) concerning how a composite integer $m$ can be represented if 
$\mathrm{GCD}(D, m) = 1$ appears in Problem 2 at the end of the chapter. What is missing 
in all this is a description of proper equivalences among the forms as in (a). We 
shall solve this question readily in Proposition 1.7 when $D < 0$. For $D > 0$, the 
answer is more complicated; we shall say what it is in Theorem 1.8, but we shall 
omit some of the proof of that theorem. 


EXAMPLES. 


(1) D = -4. Theorem 1.2a shows that the odd primes with $\left(\frac{-4}{p}\right) = +1$ are
those of the form 4k + 1. Theorem 1.6a says that each proper equivalence class
of forms of discriminant -4 has a representative (a, b, c) with $3|ac| \le 4$. Since
D < 0, we are interested only in positive definite forms, which necessarily have
a and c positive. Thus a = c = 1, and we must have b = 0. So there is only
one class of (positive definite) forms of discriminant -4, namely $x^2 + y^2$, and
Theorem 1.6b allows us to conclude that $x^2 + y^2 = p$ is solvable for each prime
p = 4k + 1. In other words, we recover the conclusion of Proposition 1.1 as far
as representability of primes is concerned.


(2) D = -20. To have $\left(\frac{-20}{p}\right) = +1$ for an odd prime p, we must have either
$\left(\frac{-1}{p}\right) = \left(\frac{5}{p}\right) = +1$ or $\left(\frac{-1}{p}\right) = \left(\frac{5}{p}\right) = -1$. Theorem 1.2 shows in the first case
that $p \equiv 1 \bmod 4$ and $p \equiv \pm 1 \bmod 5$, while in the second case $p \equiv 3 \bmod 4$
and $p \equiv \pm 3 \bmod 5$. That is, p is congruent to one of 1 and 9 modulo 20 in the
first case and to one of 3 and 7 modulo 20 in the second case. Let us consider
the forms as in Theorem 1.6a. We know that a > 0 and c > 0. The inequality
$3ac \le |D|$ forces $ac \le 6$. Since $|b| \le a \le c$, we obtain $a^2 \le 6$ and $a \le 2$.
Since 4 divides D, b is even. Then b = 0 or $b = \pm 2$. So the only possibilities
are (1, 0, 5) and $(2, \pm 2, 3)$. Because of Theorem 1.6b, any prime congruent to
one of 1, 3, 7, 9 modulo 20 is representable either by (1, 0, 5) and not $(2, \pm 2, 3)$,
or by $(2, \pm 2, 3)$ and not (1, 0, 5). We can write down all residues modulo 20 for
$x^2 + 5y^2$ and $2x^2 \pm 2xy + 3y^2$, and we find that the possible residues prime to
20 are 1 and 9 in the first case, and they are 3 and 7 in the second case. The
conclusion for odd primes p with GCD(20, p) = 1 is that

    $p \equiv 1$ or $9 \bmod 20$ implies p is representable as $x^2 + 5y^2$,
    $p \equiv 3$ or $7 \bmod 20$ implies p is representable as $2x^2 \pm 2xy + 3y^2$.

The residues modulo 20 have shown that $x^2 + 5y^2$ is not equivalent to either of
$2x^2 \pm 2xy + 3y^2$, but they do not show whether $2x^2 \pm 2xy + 3y^2$ are properly


16 I. Transition to Modern Number Theory 


equivalent to one another. Hence the Dirichlet class number h(—20) is either 2 
or 3. It will turn out to be 2. 
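
The residue computation used in this example can be checked mechanically. The following Python sketch is only an illustration; the helper name `residues_prime_to` is introduced here and is not part of the text. It lists the residues modulo n that are prime to n and are values of a given form.

```python
from math import gcd

def residues_prime_to(a, b, c, n):
    """Residues r mod n with GCD(r, n) = 1 that are values of a*x^2 + b*x*y + c*y^2."""
    # The value mod n depends only on x and y mod n, so one period suffices.
    values = {(a * x * x + b * x * y + c * y * y) % n
              for x in range(n) for y in range(n)}
    return sorted(r for r in values if gcd(r, n) == 1)

print(residues_prime_to(1, 0, 5, 20))    # x^2 + 5y^2
print(residues_prime_to(2, 2, 3, 20))    # 2x^2 + 2xy + 3y^2
print(residues_prime_to(2, -2, 3, 20))   # 2x^2 - 2xy + 3y^2
```

The first call yields the residues 1 and 9, and the other two yield 3 and 7, in agreement with the example. As the example notes, residues alone cannot decide whether (2, 2, 3) and (2, -2, 3) are properly equivalent.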


(3) D = -56. To have $\left(\frac{-56}{p}\right) = +1$ for an odd prime p, we must have an
odd number of the Legendre symbols $\left(\frac{-1}{p}\right)$, $\left(\frac{2}{p}\right)$, and $\left(\frac{7}{p}\right)$ equal to +1 and the
rest equal to -1. We readily find from Theorem 1.2 that the possibilities with
GCD(56, p) = 1 are

    $p \equiv 1, 3, 5, 9, 13, 15, 19, 23, 25, 27, 39, 45 \bmod 56$.

Applying Theorem 1.6a as in the previous example, we find that $x^2 + 14y^2$,
$2x^2 + 7y^2$, and $3x^2 \pm 2xy + 5y^2$ are representatives of all proper equivalence
classes of forms of discriminant -56. Taking into account Theorem 1.6b and the
residue classes of these forms modulo 56, we conclude for odd primes p that

    if $p \equiv$ any of 1, 9, 15, 23, 25, 39 mod 56, then
        p is representable as $x^2 + 14y^2$ or $2x^2 + 7y^2$,
    if $p \equiv$ any of 3, 5, 13, 19, 27, 45 mod 56, then
        p is representable as both of $3x^2 \pm 2xy + 5y^2$.

The question left unsettled by the argument so far is whether $x^2 + 14y^2$ is properly
equivalent to $2x^2 + 7y^2$. Equivalent forms represent the same integers, and the
integer 1 is representable by $x^2 + 14y^2$ but not by $2x^2 + 7y^2$. Hence the two forms
are not equivalent and cannot be properly equivalent. According to Theorem
1.6b, the primes of the first line are therefore representable by either $x^2 + 14y^2$ or
$2x^2 + 7y^2$ but never by both. Hence the Dirichlet class number h(-56) is either
3 or 4. It will turn out to be 4.

(4) D = 5. The forms of discriminant 5 are indefinite. Applying Theorem
1.6a, we obtain $3|ac| \le 5$. Hence |a| = |c| = 1. Since D is odd, b is odd. The
inequality $|b| \le |a|$ thus forces |b| = 1. Then D = 1 - 4ac shows that ac < 0.
The possibilities are therefore $(1, \pm 1, -1)$ and $(-1, \pm 1, 1)$. The Dirichlet class
number h(5) is at most 4. It will turn out to be 1. Let us take this fact as known.
The odd primes p with $\left(\frac{5}{p}\right) = +1$ are $p = 5k \pm 1$. Under the assumption that the
class number is 1, Theorem 1.6b shows that every such prime is representable as
$x^2 + xy - y^2$.


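PROOF OF THEOREM 1.6a. We consider the effect of two transformations in
SL(2, Z), one via $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ and the other via $\begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$. Under these, the matrix
associated to (a, b, c) becomes

\[ \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 2c & -b \\ -b & 2a \end{pmatrix} \]

and

\[ \begin{pmatrix} 1 & 0 \\ n & 1 \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2a & 2an+b \\ 2an+b & 2an^2+2bn+2c \end{pmatrix}, \]

respectively. Thus the transformations are

\[ (a, b, c) \mapsto (c, -b, a), \tag{$*$} \]
\[ (a, b, c) \mapsto (a, 2an + b, c'). \tag{$**$} \]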


Possibly applying (*) allows us to make $|a| \le |c|$ while leaving |b| alone. Since
$a \ne 0$, we can apply (**) with n the closest integer to $-\frac{b}{2a}$ to make $|b| \le |a|$.
This step possibly changes c. Thus after this step, we again apply (*) if necessary
to make $|a| \le |c|$, and we apply (**) again. In each pair of steps, we may assume
that |b| strictly decreases or else that n = 0. We cannot always be in the former
case, since |b| is bounded below by 0. Thus at some point we obtain n = 0. At
this point, c does not change, and thus we have $|b| \le |a| \le |c|$, as required.
The inequalities $|b| \le |a| \le |c|$ imply that

\[ 4|ac| = |D - b^2| \le |D| + |b|^2 \le |D| + |ac|, \]

and hence $3|ac| \le |D|$. Since neither a nor c is 0, it follows that the inequalities
$|b| \le |a| \le |c|$ imply that |a|, |b|, |c| are all bounded by |D|. Therefore the
Dirichlet class number h(D) is finite.
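
The constructive argument for (a) translates directly into a short procedure. The following Python sketch is one possible rendering, not the author's code: the name `reduce_form` is ours, and the sketch assumes that the discriminant is not a square, so that a and c never become 0. It alternates the swap (a, b, c) -> (c, -b, a) with the shift (a, b, c) -> (a, 2an + b, c'), choosing n so that the new middle coefficient lies in [-|a|, |a|].

```python
def reduce_form(a, b, c):
    """Return a properly equivalent form with |b| <= |a| <= |c|.

    Assumes b*b - 4*a*c is not a perfect square, so a and c stay nonzero.
    """
    while True:
        if abs(a) > abs(c):
            a, b, c = c, -b, a            # transformation (*)
        m = 2 * abs(a)
        r = b % m                          # representative of b mod 2|a| in [0, 2|a|)
        if r > abs(a):
            r -= m                         # move it into (-|a|, |a|]
        n = (r - b) // (2 * a)             # exact division: r = b + 2*a*n
        a, b, c = a, r, a * n * n + b * n + c   # transformation (**), uses the old b
        if abs(a) <= abs(c):
            return (a, b, c)

print(reduce_form(6, 2, 1))      # a form of discriminant -20
print(reduce_form(29, 26, 6))    # another form of discriminant -20
```

Each pass through the loop that does not return is followed by a swap that strictly decreases |a|, so the loop terminates. Both sample forms reduce to (1, 0, 5).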


PROOF OF NECESSITY IN THEOREM 1.6b. Suppose x and y are integers with
GCD(x, y) = 1 and $ax^2 + bxy + cy^2 = p$. Then $ax^2 + bxy + cy^2 \equiv 0 \bmod p$.
Choose u and v with ux + vy = 1. Routine computation shows that

\[ 4(ax^2 + bxy + cy^2)(av^2 - buv + cu^2) \]
\[ = [u(xb + 2yc) - v(2xa + yb)]^2 - (b^2 - 4ac)(xu + yv)^2 \]
\[ = [u(xb + 2yc) - v(2xa + yb)]^2 - (b^2 - 4ac), \]

and hence

\[ 0 \equiv [u(xb + 2yc) - v(2xa + yb)]^2 - (b^2 - 4ac) \bmod p. \]

Consequently $D \equiv [u(xb + 2yc) - v(2xa + yb)]^2 \bmod p$, and D is exhibited as
a square modulo p.


PROOF OF SUFFICIENCY IN THEOREM 1.6b. Choose an integer solution b of
$b^2 \equiv D \bmod p$. Since b + p is another solution and has the opposite parity,
we may assume that b and D have the same parity. Then $b^2 \equiv D \bmod p$ and
$b^2 \equiv D \bmod 4$, so that $b^2 \equiv D \bmod 4p$. Since GCD(D, p) = 1, p does not
divide b, and the forms $\left(p, \pm b, \frac{b^2 - D}{4p}\right)$ are primitive. They have discriminant
$b^2 - 4p\,\frac{b^2 - D}{4p} = D$, they take the value p for (x, y) = (1, 0), and they are
improperly equivalent via $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. Thus the forms in the statement of the theorem
exist.
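For the uniqueness suppose that a form (a, b, c) of discriminant D represents
p, say with $ax_0^2 + bx_0y_0 + cy_0^2 = p$. Since this representation has to be primitive,
we know that $\mathrm{GCD}(x_0, y_0) = 1$. Put $\begin{pmatrix} \alpha \\ \gamma \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$ and choose integers $\beta$
and $\delta$ such that $\alpha\delta - \beta\gamma = 1$. Then $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ has determinant 1 and satisfies
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$. The equality $ax_0^2 + bx_0y_0 + cy_0^2 = \tfrac12 \begin{pmatrix} x_0 & y_0 \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$
therefore yields

\[ p = \tfrac12 \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]

Consequently the form (a', b', c') associated to the matrix $\begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$
takes on the value p at (x, y) = (1, 0) and is properly equivalent to (a, b, c). In
particular, it is a form (p, b', c') for some b' and c' such that $b'^2 - 4pc' = D$.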

Thus in the proof of uniqueness, we may assume that we have two forms
(p, b', c') and (p, b'', c'') of discriminant D. Then $b'^2 \equiv D \equiv b''^2 \bmod 4p$. The
conditions $b''^2 \equiv b'^2 \bmod p$ and $b'^2 \equiv b''^2 \bmod 4$ imply that $b'' \equiv \pm b' \bmod p$
and $b'' \equiv b' \bmod 2$ for one of the choices of sign. Thus $b'' \equiv \pm b' \bmod 2p$ for
that choice of sign. Let us write $b'' = \pm b' + 2np$ for some integer n. The matrix
equality

\[ \begin{pmatrix} 1 & 0 \\ n & 1 \end{pmatrix} \begin{pmatrix} 2p & \pm b' \\ \pm b' & 2c' \end{pmatrix} \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2p & 2pn \pm b' \\ 2pn \pm b' & 2(*) \end{pmatrix} \]

shows that $(p, \pm b', c')$ is properly equivalent to (p, b'', *). Since the discriminant
has to be D, we conclude that * = c''. That is, (p, b'', c'') is properly equivalent to
$(p, \pm b', c')$ for that same choice of sign. Since (p, b', c') is improperly equivalent
to (p, -b', c'), the proof of the theorem is complete.
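
The existence half of the proof of Theorem 1.6b is constructive as well. The following Python sketch is an illustration under stated assumptions: p is an odd prime with GCD(D, p) = 1, D is congruent to 0 or 1 modulo 4, and the function name `forms_representing_prime` is ours. It searches for b with $b^2 \equiv D \bmod 4p$ by brute force rather than by the parity adjustment used in the proof.

```python
def forms_representing_prime(D, p):
    """Return ((p, b, c), (p, -b, c)) with b*b - 4*p*c == D, or None if D is
    not a square modulo 4p. Assumes p is an odd prime not dividing D."""
    # b*b mod 4p depends only on b mod 2p, so searching 0 <= b < 2p suffices.
    for b in range(2 * p):
        if (b * b - D) % (4 * p) == 0:
            c = (b * b - D) // (4 * p)
            return (p, b, c), (p, -b, c)
    return None

print(forms_representing_prime(-20, 29))   # 29 is congruent to 9 mod 20
print(forms_representing_prime(-20, 11))   # 11 is not among 1, 3, 7, 9 mod 20
```

For D = -20 and p = 29 the search finds (29, 26, 6) and (29, -26, 6), each of which takes the value 29 at (x, y) = (1, 0). For p = 11 it returns `None`, consistent with the residue conditions of Example 2.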


Our discussion of representability of primes p by binary quadratic forms 
of discriminant D when GCD(D, p) = 1 will be complete once we have a 
set of representatives of proper equivalence classes with no redundancy. For 
discriminant D < 0, this step is not difficult and amounts, according to Theorem 
1.6a, to sorting out proper equivalences among forms (a, b, c) with $b^2 - 4ac = D$
and $|b| \le |a| \le |c|$. Let us call a form with D < 0 reduced when it satisfies
these conditions. 




There are two redundancies that are easy to spot, namely 


    (a, b, a) is properly equivalent to (a, -b, a) via $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$,

    (a, a, c) is properly equivalent to (a, -a, c) via $\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}$.


The result for D < 0 is that there are no other redundancies among reduced 
forms. 


Proposition 1.7. Fix a negative discriminant D. With the exception of the 
proper equivalences of 


(a,b,a) to (a,—b,a) 


and (a,a,c) to (a,—a,c), 


no two distinct reduced positive definite forms of discriminant D are properly 
equivalent. 


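PROOF. Suppose that (a, b, c) is properly equivalent to (a', b', c'), that both
are reduced, and that $a \ge a' > 0$. For some $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ in SL(2, Z), we have
$a' = a\alpha^2 + b\alpha\gamma + c\gamma^2$. Hence the inequalities $c \ge a$ and $|b| \le a$ imply that

\[ a \ge a\alpha^2 + b\alpha\gamma + c\gamma^2 \ge a(\alpha^2 + \gamma^2) + b\alpha\gamma \ge a(\alpha^2 + \gamma^2) - a|\alpha\gamma| \ge a|\alpha\gamma|, \tag{$*$} \]

and $\alpha\gamma$ equals 0 or $\pm 1$. Thus the ordered pair $(\alpha, \gamma)$ is one of $(0, \pm 1)$, $(\pm 1, 0)$,
$(\pm 1, 1)$, $(\pm 1, -1)$. Multiplying $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ if necessary by $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$, which acts
trivially on quadratic forms, we may assume that $(\alpha, \gamma)$ is one of (0, 1), (1, 0),
$(1, \pm 1)$. We treat these three cases separately.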

Case 1. $(\alpha, \gamma) = (0, 1)$. The condition $\alpha\delta - \beta\gamma = 1$ forces $\beta = -1$, and the
formula $b' = 2a\alpha\beta + b\alpha\delta + b\beta\gamma + 2c\gamma\delta$ gives $(a', b', c') = (c, -b + 2c\delta, *)$.
Since $|b| \le c$ and $|b - 2c\delta| \le c$, we must have $|\delta| \le 1$. If $\delta = 0$, we are
led to (a', b', c') = (c, -b, a), which is reduced only if c = a, and this is the
first of the two allowable exceptions. If $|\delta| = 1$, the triangle inequality gives
$2c = |2c\delta| \le |b| + |2c\delta - b| \le c + c = 2c$, and therefore $|b| = c = |b - 2c\delta|$.
Then $b = -(b - 2c\delta)$, and $b = c\delta = \pm c$. Since $|b| \le a \le c$, $b = \pm a$ also.
Hence (a', b', c') = (a, -b, a), and this is again the first of the two allowable
exceptions.

Case 2. $(\alpha, \gamma) = (1, 0)$. The condition $\alpha\delta - \beta\gamma = 1$ forces $\delta = 1$, and thus
$(a', b', c') = (a, b + 2a\beta, *)$. Since $|b| \le a$ and $|b + 2a\beta| \le a$, we must have
$|\beta| \le 1$. If $\beta = 0$, then (a', b', c') = (a, b, c), and there is nothing to prove.
If $|\beta| = 1$, the triangle inequality gives $2a = |2a\beta| \le |-b| + |2a\beta + b|$, and
therefore $|b| = a = |b + 2\beta a|$. Then $b = -(b + 2\beta a)$, and we conclude that
$b = -a\beta = \pm a$ and $b + 2\beta a = \mp a$. Hence the proper equivalence in question
is of (a, a, c) to (a, -a, c), which is the second of the two allowable exceptions.

Case 3. $(\alpha, \gamma) = (1, \pm 1)$. From (*) and the assumption that $a \ge a'$, we
have $a \ge a' \ge a|\alpha\gamma| = a$. Thus a = a', and the definition of a' shows that
$a = a + b\gamma + c$. Hence $c = -b\gamma$, and c = |b|. Since $|b| \le a \le c$, we obtain
$-b\gamma = a = c$. The formula $b' = 2a\alpha\beta + b\alpha\delta + b\beta\gamma + 2c\gamma\delta$ then simplifies to
$b' = 2a\beta + b\delta + b\beta\gamma + 2a\gamma\delta = (2a + b\gamma)(\beta + \gamma\delta)$. From $\alpha\delta - \beta\gamma = 1$, we
have $\delta - \beta\gamma = 1$ and thus also $\gamma\delta = \gamma + \beta$. Therefore $\beta + \gamma\delta = 2\beta + \gamma$, and
this cannot be 0. So $|b'| \ge |2a + b\gamma| = |2a - a| = a = a'$. Since (a', b', c')
is reduced, |b'| = a' = a = c = |b|, and the proper equivalence is of (a, a, a)
to (a, -a, a). This is an instance of both allowable exceptions, and the proof is
complete.


EXAMPLES, CONTINUED. 


(2) D = -20. We saw earlier that the reduced positive definite forms with
D = -20 are $x^2 + 5y^2$ and $2x^2 \pm 2xy + 3y^2$, i.e., (1, 0, 5) and $(2, \pm 2, 3)$. The
remarks preceding Proposition 1.7 show that (2, 2, 3) is properly equivalent to
(2, -2, 3), and the proposition shows that (1, 0, 5) is not properly equivalent to
(2, 2, 3). (We saw this latter conclusion for this example earlier by considering
residues.) Consequently h(-20) = 2.


(3) D = -56. We saw earlier that the reduced positive definite forms with
D = -56 are $x^2 + 14y^2$, $2x^2 + 7y^2$, and $3x^2 \pm 2xy + 5y^2$, i.e., (1, 0, 14), (2, 0, 7),
(3, 2, 5), and (3, -2, 5). Proposition 1.7 shows that no two of these four forms
are properly equivalent. Consequently h(-56) = 4.
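
For negative discriminants, Theorem 1.6a and Proposition 1.7 together give a finite counting procedure. The following Python sketch is a minimal illustration with function names of our own choosing. It assumes that the class number counts primitive positive definite forms, and it uses the bound $3a^2 \le 3ac \le |D|$ to limit the search.

```python
from math import gcd

def reduced_positive_definite(D):
    """Primitive forms (a, b, c) with b^2 - 4ac = D < 0, a > 0, |b| <= a <= c."""
    forms = []
    a = 1
    while 3 * a * a <= -D:               # |b| <= a <= c forces 3a^2 <= |D|
        for b in range(-a, a + 1):
            if (b * b - D) % (4 * a) == 0:
                c = (b * b - D) // (4 * a)
                if c >= a and gcd(gcd(a, abs(b)), c) == 1:
                    forms.append((a, b, c))
        a += 1
    return forms

def dirichlet_class_number(D):
    """Count reduced forms, identifying (a, b, a) with (a, -b, a) and
    (a, a, c) with (a, -a, c) as in Proposition 1.7."""
    count = 0
    for a, b, c in reduced_positive_definite(D):
        if b < 0 and (a == c or -b == a):
            continue                      # already counted through (a, -b, c)
        count += 1
    return count

print(reduced_positive_definite(-56))
print(dirichlet_class_number(-20), dirichlet_class_number(-56))
```

For D = -56 the enumeration produces (1, 0, 14), (2, 0, 7), (3, -2, 5), and (3, 2, 5), and the counts for D = -20 and D = -56 are 2 and 4, matching the examples.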


Let us turn our attention to D > 0. We still have the proper equivalences 
of (a,b, a) to (a, —b, a) and (a, a,c) to (a, —a,c) as in the remarks before 
Proposition 1.7. But there can be others, and the question is subtle. Here are 
some simple examples. 


EXAMPLES WITH POSITIVE DISCRIMINANT. 
(1) D = 5. The forms with D = 5 satisfying the inequalities $|b| \le |a| \le |c|$ of
Theorem 1.6a are $(1, \pm 1, -1)$ and $(-1, \pm 1, 1)$. The second standard equivalence
allows us to discard one form from each pair, and we are left with (1, 1, -1) and
(-1, -1, 1). The first of these two is equivalent to the second via
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. Thus h(5) = 1, as was announced without proof in Example 4 earlier in
this section.

(2) D = 13. The forms with D = 13 satisfying the inequalities $|b| \le |a| \le |c|$
of Theorem 1.6a are $(1, \pm 1, -3)$ and $(-1, \pm 1, 3)$. The second standard
equivalence allows us to discard one form from each pair, and we are left with
(1, 1, -3) and (-1, -1, 3). The first of these two is equivalent to the second via
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} 1 & -2 \\ 1 & -1 \end{pmatrix}$. Thus h(13) = 1.

(3) D = 21. The forms with D = 21 satisfying the inequalities $|b| \le |a| \le |c|$
of Theorem 1.6a are $(1, \pm 1, -5)$ and $(-1, \pm 1, 5)$. The second standard
equivalence allows us to discard one form from each pair, and we are left with
(1, 1, -5) and (-1, -1, 5). These are not properly equivalent. In fact, the form
$-x^2 - xy + 5y^2$ is -1 for (x, y) = (1, 0), but $x^2 + xy - 5y^2 = -1$ is not even
solvable modulo 3. Thus h(21) = 2.


Although the starting data for these three examples are similar, the outcomes 
are strikingly different. The idea for what to do involves starting afresh with the 
reduction question that was addressed in Theorem 1.6a. For discriminant D > 0, 
a different reduction is to be used. The reduction in question appears in Theorem 
1.8a below, but some preliminary remarks are needed to explain the proof. 

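Two forms (a, b, c) and (a', b', c') of discriminant D > 0 will be said to be
neighbors if c = a' and $b + b' \equiv 0 \bmod 2c$. More precisely we say in this
case that (a', b', c') is a neighbor on the right of (a, b, c) and that (a, b, c) is
a neighbor on the left of (a', b', c'). A key observation is that neighbors are
properly equivalent to one another. In fact, if (a', b', c') is a neighbor on the right
of (a, b, c), define $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & (b + b')/(2c) \end{pmatrix}$. Then computation gives

\[ \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} 2c & b' \\ b' & 2a - (b - b')\frac{b + b'}{2c} \end{pmatrix}. \]

The lower right entry of this matrix is an even integer, since $b + b' \equiv 0 \bmod 2c$
and since, as a consequence, $b + b' \equiv 0 \bmod 2$. Hence (a, b, c) is transformed
into (c, b', c'), where $c' = a - \tfrac12 (b - b')\,\frac{b + b'}{2c}$.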

Let us call a primitive form (a, b, c) of discriminant D > 0 reduced when it
satisfies the conditions

\[ 0 < b < \sqrt{D} \quad \text{and} \quad \sqrt{D} - b < 2|a| < \sqrt{D} + b. \]

The first inequality shows that b is bounded if D is fixed, and the equality
$-4ac = D - b^2$ shows that there are only finitely many possibilities for a
and c. Consequently there are only finitely many reduced forms for given D.

From $|b| < \sqrt{D}$, we see that $b^2 < D = b^2 - 4ac$ and ac < 0; thus any reduced
form has a and c of opposite sign. Then $D - b^2 = -4ac = (2|a|)(2|c|)$, and it
follows that $2|a| > \sqrt{D} - b$ implies $2|c| < \sqrt{D} + b$ and that $2|a| < \sqrt{D} + b$
implies $2|c| > \sqrt{D} - b$. Consequently

\[ \sqrt{D} - b < 2|c| < \sqrt{D} + b. \]




Theorem 1.8. Fix a positive nonsquare discriminant D. 


(a) Each form of discriminant D is properly equivalent to some reduced form 
of discriminant D. 

(b) Each reduced form of discriminant D is a neighbor on the left of one and 
only one reduced form of discriminant D and is a neighbor on the right of one 
and only one reduced form of discriminant D. 

(c) The reduced forms of discriminant D occur in uniquely determined cycles, 
each one of even length, such that each member of a cycle is an iterated neighbor 
on the right to all members of the cycle and consequently is properly equivalent 
to all other members of the cycle. 

(d) Two reduced forms of discriminant D are properly equivalent if and only 
if they lie in the same cycle in the sense of (c). 


REMARKS. Conclusion (d) is the deepest part of the theorem, involving a subtle 
argument that in essence uses the periodic continued-fraction expansion of the 
roots z of the polynomial az? + bz +c if (a, b, c) is a form under consideration. 
We shall prove (a) through (c), omitting the proof of (d), and then we shall return 
to the three examples D = 5, 13, 21 begun just above.


PROOF OF THEOREM 1.8a. If (a, b, c) is given and is not reduced, let m be the
unique integer such that

\[ \sqrt{D} - 2|c| < -b + 2cm < \sqrt{D}, \tag{$*$} \]

and define $(a', b', c') = (c, -b + 2cm, a - bm + cm^2)$. Then

\[ b'^2 - 4a'c' = (-b + 2cm)^2 - 4c(a - bm + cm^2) \]
\[ = b^2 - 4bcm + 4c^2m^2 - 4ac + 4bcm - 4c^2m^2 = b^2 - 4ac = D, \]

and we observe that a' = c and that $b + b' = 2cm \equiv 0 \bmod 2c$. Consequently
(a', b', c') is a form of discriminant D and is a right neighbor to (a, b, c). By the
remarks before the theorem, (a, b, c) is properly equivalent to (a', b', c').

We repeat this process at least once, obtaining (a'', b'', c''). If |a''| < |a'|, we
repeat it again, obtaining (a''', b''', c'''), and we continue in this way. Eventually
the strict decrease of the magnitude of the first entry must stop. To keep the
notation simple, we may assume without loss of generality that $|a''| \ge |a'|$. The
claim is that (a', b', c') is then reduced.

Put $u = \sqrt{D} - b'$ and $v = b' - (\sqrt{D} - 2|a'|)$. The inequalities (*) show that
u > 0 and v > 0. Therefore

\[ 0 < v^2 + 2uv + 2u\sqrt{D} = (u + v)^2 - u^2 + 2u\sqrt{D} \]
\[ = 4a'^2 - (D - 2b'\sqrt{D} + b'^2) + 2D - 2b'\sqrt{D} \]
\[ = 4a'^2 + D - b'^2 = 4a'^2 - 4a'c'. \]

Since $|c'| = |a''| \ge |a'|$, this inequality shows that a'c' < 0. Therefore
$b'^2 = D + 4a'c' < D$, and $|b'| < \sqrt{D}$.
From a'c' < 0 and $|a'| \le |c'|$, we see that $4|a'|^2 \le 4|a'c'| = -4a'c' =
D - b'^2 \le D$. Therefore $2|a'| < \sqrt{D}$. The inequality $\sqrt{D} - 2|c| < b'$ implies
that $\sqrt{D} - b' < 2|c| = 2|a'|$. The right side has just been shown to be $< \sqrt{D}$,
and therefore b' > 0. Hence $\sqrt{D} - b' < 2|a'| < \sqrt{D} < \sqrt{D} + b'$.


PROOF OF THEOREM 1.8b. Suppose that (a, b, c) is reduced and that (a', b', c')
is a reduced neighbor on the right of (a, b, c). Then we must have a' = c and
$b + b' \equiv 0 \bmod 2c$. Since $\sqrt{D} - b' < 2|a'|$ and $b' < \sqrt{D}$, we have $\sqrt{D} - 2|a'| <
b' < \sqrt{D}$. That is, $\sqrt{D} - 2|c| < b' < \sqrt{D}$. These inequalities in combination
with the congruence $b + b' \equiv 0 \bmod 2c$ show that (a, b, c) uniquely determines
b'. Since (a', b', c') is to have discriminant D, c' is uniquely determined also.

We turn this construction around to prove existence of a right neighbor. Define
(a', b', c') in terms of (a, b, c) as in the proof of Theorem 1.8a. Then a' = c, and
b' is the unique integer such that $b + b' \equiv 0 \bmod 2c$ and

\[ \sqrt{D} - 2|c| < b' < \sqrt{D}. \]

The form (a', b', c') is a right neighbor of (a, b, c), and we are to show that
(a', b', c') is reduced.

Since (a, b, c) is reduced, we have $\sqrt{D} - b < 2|c| < \sqrt{D} + b$ and $b < \sqrt{D}$.
Let m be the integer such that $b + b' = 2m|c|$. Addition of the inequalities
$b' - (\sqrt{D} - 2|c|) > 0$ and $\sqrt{D} + b - 2|c| > 0$ gives $2m|c| = b + b' > 0$,
and thus m > 0. Hence $m - 1 \ge 0$. Addition of the inequalities $\sqrt{D} - b > 0$
and $b' - (\sqrt{D} - 2|c|) > 0$ gives $0 < b' - b + 2|c| = 2b' - (b + b') + 2|c| =
2b' - 2(m - 1)|c|$. Hence $2b' > 2(m - 1)|c| \ge 0$, and we see that b' > 0.
Therefore $0 < b' < \sqrt{D}$.

The definition of b' gives $\sqrt{D} - b' < 2|c| = 2|a'|$. Addition of the inequalities
$2(m - 1)|c| \ge 0$ and $\sqrt{D} - b > 0$ gives $b + b' - 2|c| + \sqrt{D} - b > 0$, which
says that $2|a'| < \sqrt{D} + b'$. Therefore (a', b', c') is reduced.

Let R be the operation of passing from a reduced form (a, b, c) to its unique 
reduced right neighbor (a’, b’, c’). What we have just shown implies that R acts 
as a permutation of the finite set of reduced forms of discriminant D. This set 
being finite, let n be the order of R. Then the set $\{R^k \mid 0 \le k \le n - 1\}$ is a
cyclic group of permutations of the set of reduced forms of discriminant D. The 
existence of a two-sided inverse of R as a permutation implies that each reduced 
form of discriminant D has exactly one left neighbor. Thus the existence and 
uniqueness of neighbors on one side for reduced forms, in the presence of the 
finiteness of the set, implies existence and uniqueness on the other side. 




PROOF OF THEOREM 1.8c. We continue with R as the operation of passing from 
a reduced form to its unique reduced right neighbor, letting $\{R^k \mid 0 \le k \le n - 1\}$
be the finite cyclic group of powers of R. This group acts on the set of reduced
forms of discriminant D, and the cycles in question are the orbits under this action.
To see that each orbit has an even number of members, we recall that a reduced
form (a, b, c) has a and c of opposite sign. Thus if, for example, a is positive,
then $R^j(a, b, c) = (a', b', c')$ has $(-1)^j a'$ positive. If the orbit of (a, b, c) has
k members, then $R^k(a, b, c) = (a, b, c)$. Consequently $(-1)^k a$ has to have the
same sign as a, and k has to be even. Finally the members of each orbit are
properly equivalent to one another because, as we observed before the statement 
of the theorem, a form is properly equivalent to each of its neighbors. 


EXAMPLES WITH POSITIVE DISCRIMINANT, CONTINUED. 

(1) D = 5. The forms with D = 5 satisfying the inequalities of Theorem 
1.8a are (1, 1, —1) and (—1, 1, 1), and these consequently represent all proper 
equivalence classes. They form a single cycle and are properly equivalent by 
Theorem 1.8c. Thus again we obtain the easy conclusion that h(5) = 1. 

(2) D = 13. The forms with D = 13 satisfying the inequalities of Theorem
1.8a are (1, 3, -1) and (-1, 3, 1), which make up a single cycle. Thus h(13) = 1.

(3) D = 21. The forms with D = 21 satisfying the inequalities of Theorem
1.8a are (1, 3, -3) and (-3, 3, 1), which make up one cycle, and (-1, 3, 3) and
(3, 3, -1), which make up another cycle. Thus h(21) = 2.
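
The cycles in these examples can be generated mechanically from the definitions of "reduced" and "right neighbor." The following Python sketch is an illustration only; the function names are ours, and reading off the class number as the number of cycles relies on Theorem 1.8d, whose proof is omitted in the text. Comparisons with $\sqrt{D}$ are done in exact integer arithmetic, which is legitimate because D is not a square.

```python
from math import gcd, isqrt

def reduced_forms(D):
    """Primitive (a, b, c) of nonsquare discriminant D > 0 with
    0 < b < sqrt(D) and sqrt(D) - b < 2|a| < sqrt(D) + b."""
    s = isqrt(D)
    assert D > 0 and s * s != D and D % 4 in (0, 1)
    forms = []
    for b in range(1, s + 1):
        if (b * b - D) % 4:
            continue
        ac = (b * b - D) // 4                      # negative, since b^2 < D
        for t in range(1, -ac + 1):                # t plays the role of |a|
            if ac % t:
                continue
            lower_ok = (2 * t + b) ** 2 > D        # 2|a| > sqrt(D) - b
            upper_ok = 2 * t - b <= 0 or (2 * t - b) ** 2 < D   # 2|a| < sqrt(D) + b
            if lower_ok and upper_ok:
                for a in (t, -t):
                    c = ac // a
                    if gcd(gcd(t, b), abs(c)) == 1:
                        forms.append((a, b, c))
    return forms

def right_neighbor(form, D):
    """(c, b', c') with b + b' = 0 mod 2c and sqrt(D) - 2|c| < b' < sqrt(D)."""
    a, b, c = form
    s, m = isqrt(D), 2 * abs(c)
    b2 = s - ((s + b) % m)          # the unique such b' in [s + 1 - 2|c|, s]
    return (c, b2, (b2 * b2 - D) // (4 * c))

def cycles(D):
    remaining, result = set(reduced_forms(D)), []
    while remaining:
        start = min(remaining)
        cycle, f = [start], right_neighbor(start, D)
        while f != start:
            cycle.append(f)
            f = right_neighbor(f, D)
        remaining -= set(cycle)
        result.append(cycle)
    return result

for D in (5, 13, 21):
    print(D, cycles(D))
```

For D = 5 and D = 13 the sketch finds one cycle of length 2, and for D = 21 it finds two cycles of length 2.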


4. Composition of Forms, Class Group 


The identity $(x_1^2 + y_1^2)(x_2^2 + y_2^2) = (x_1x_2 - y_1y_2)^2 + (x_1y_2 + x_2y_1)^2$, which can be
derived by factoring the left side in $\mathbb{Q}(\sqrt{-1})[x_1, y_1, x_2, y_2]$ and rearranging the
factors, readily generalizes to an identity involving any form $x^2 + bxy + cy^2$ of
nonsquare discriminant $D = b^2 - 4c$. We complete the square, writing the form
as $(x + \tfrac12 by)^2 - \tfrac14 y^2 D$ and factoring it as $(x + \tfrac12 by + \tfrac12 y\sqrt{D})(x + \tfrac12 by - \tfrac12 y\sqrt{D})$,
and we obtain

\[ (x_1^2 + bx_1y_1 + cy_1^2)(x_2^2 + bx_2y_2 + cy_2^2) \]
\[ = (x_1x_2 - cy_1y_2)^2 + b(x_1x_2 - cy_1y_2)(x_1y_2 + x_2y_1 + by_1y_2) \]
\[ + c(x_1y_2 + x_2y_1 + by_1y_2)^2. \]

Improving on an earlier attempt by Legendre, Gauss made a thorough inves- 
tigation of how one might multiply two distinct forms of the same nonsquare 
discriminant, not necessarily with first coefficient 1, and Dirichlet reworked the 
theory and simplified it. Out of this work comes the following composition 
formula, of which the above formula is manifestly a special case. 




Proposition 1.9. Let $(a_1, b, c_1)$ and $(a_2, b, c_2)$ be two primitive forms with the
same middle coefficient b and with the same nonsquare discriminant D, hence
with $a_1c_1 = a_2c_2 \ne 0$. Suppose that $j = c_1a_2^{-1} = c_2a_1^{-1}$ is an integer. Then the
form $(a_1a_2, b, j)$ is primitive of discriminant D, and it has the property that

\[ (a_1x_1^2 + bx_1y_1 + c_1y_1^2)(a_2x_2^2 + bx_2y_2 + c_2y_2^2) \]
\[ = a_1a_2(x_1x_2 - jy_1y_2)^2 + b(x_1x_2 - jy_1y_2)(a_1x_1y_2 + a_2x_2y_1 + by_1y_2) \]
\[ + j(a_1x_1y_2 + a_2x_2y_1 + by_1y_2)^2. \]


REMARKS. Consequently if an integer m is represented by the form $(a_1, b, c_1)$
and an integer n is represented by the form $(a_2, b, c_2)$, then mn is represented by
the form $(a_1a_2, b, j)$. For example we saw in an example with D = -20 immediately
following the statement of Theorem 1.6 that any prime that is congruent to
3 or 7 modulo 20 is representable as $2x^2 + 2xy + 3y^2$. If we have two such primes
p and q, then p is representable by (2, 2, 3) and q is representable by (3, 2, 2).
The proposition is applicable with j = 1 and shows that pq is representable by
(6, 2, 1). In turn, substitution using $\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix}$ changes this form to the
properly equivalent form (5, 0, 1). Thus pq is representable as $x^2 + 5y^2$.


PROOF. The form $(a_1a_2, b, j)$ is primitive because any prime that divides
$\mathrm{GCD}(a_1a_2, b, j)$ has to divide either $\mathrm{GCD}(a_1, b, j)$ or $\mathrm{GCD}(a_2, b, j)$ and then
certainly has to divide $\mathrm{GCD}(a_1, b, c_1)$ or $\mathrm{GCD}(a_2, b, c_2)$. No such prime exists,
and hence $(a_1a_2, b, j)$ is primitive. The discriminant of $(a_1a_2, b, j)$ is
$b^2 - 4ja_1a_2 = D + 4a_1c_1 - 4ja_1a_2 = D + 4a_1c_1 - 4(c_1a_2^{-1})a_1a_2 = D$,
as asserted, and the verification of the displayed identity is a routine computation.
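
The "routine computation" in the proof can be spot-checked numerically. The following Python sketch is only a sanity check on random integer inputs, not a proof; the function names `compose` and `check_identity` are ours. It builds $c_1 = ja_2$ and $c_2 = ja_1$ from the given data so that the two forms are aligned in the sense of Proposition 1.9.

```python
import random

def compose(a1, a2, b, c1):
    """Return (a1*a2, b, j) for aligned forms (a1, b, c1), (a2, b, c2), j = c1/a2."""
    assert c1 % a2 == 0, "j = c1/a2 must be an integer"
    return (a1 * a2, b, c1 // a2)

def check_identity(a1, a2, b, j, trials=200):
    """Test the identity of Proposition 1.9 at random integer points."""
    c1, c2 = j * a2, j * a1
    for _ in range(trials):
        x1, y1, x2, y2 = (random.randint(-50, 50) for _ in range(4))
        lhs = ((a1 * x1 * x1 + b * x1 * y1 + c1 * y1 * y1)
               * (a2 * x2 * x2 + b * x2 * y2 + c2 * y2 * y2))
        X = x1 * x2 - j * y1 * y2
        Y = a1 * x1 * y2 + a2 * x2 * y1 + b * y1 * y2
        if lhs != a1 * a2 * X * X + b * X * Y + j * Y * Y:
            return False
    return True

print(compose(2, 3, 2, 3))          # (2, 2, 3) composed with (3, 2, 2)
print(check_identity(2, 3, 2, 1))
```

With the forms (2, 2, 3) and (3, 2, 2) of the remarks above, `compose` returns (6, 2, 1), and the identity holds at every sampled point.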


Let us say that two primitive forms $(a_1, b_1, c_1)$ and $(a_2, b_2, c_2)$ of the same
nonsquare discriminant are aligned if $b_1 = b_2$ and if $j = c_1a_2^{-1} = c_2a_1^{-1}$ is an
integer. In the presence of equal nonsquare discriminants D and the equal middle
entries b, the rational number j is automatically an integer if $\mathrm{GCD}(a_1, a_2) = 1$.
In fact, the equality $D - b^2 = -4a_1c_1 = -4a_2c_2$ shows that $D - b^2$ is divisible
by $4a_1$ and by $4a_2$; since $\mathrm{GCD}(a_1, a_2) = 1$, $D - b^2$ is divisible by $4a_1a_2$, and the
quotient -j is an integer.

The idea is that each pair of classes of properly equivalent primitive forms 
of discriminant D has a pair of aligned representatives, and a multiplication of 
proper equivalence classes is well defined if the product is defined as the class of 
the composition of these aligned representatives in the sense of Proposition 1.9. 
This multiplication for proper equivalence classes will make the set of classes 
into a finite abelian group. This group will be defined as the “form class group” 
for the discriminant D, except that we use only the positive definite classes in the 




case that D < 0. Before phrasing these statements as a theorem, we make some 
remarks and then state and prove two lemmas. 

Let (a, b, c) be a form of nonsquare discriminant D, and let b' be an integer
with $b' \equiv b \bmod 2a$. In this case the number $c' = (b'^2 - D)/(4a)$ is an integer;
in fact, we certainly have the congruences $b'^2 \equiv b^2 \bmod 2a$ and $b'^2 \equiv b^2 \bmod 4$,
and thus we obtain the automatic⁷ consequence $b'^2 \equiv b^2 \bmod 4a$, the rewritten
congruence $b'^2 \equiv D + 4ac \bmod 4a$, and the desired result $b'^2 - D \equiv 0 \bmod 4a$.
Hence (a, b', c') is another form of discriminant D. We call (a, b', c') a translate
of (a, b, c). The key observation about translates is that the translate (a, b', c') is
properly equivalent to (a, b, c). This fact follows from the computation

\[ \begin{pmatrix} 1 & 0 \\ l & 1 \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2a & b + 2al \\ b + 2al & 2(al^2 + bl + c) \end{pmatrix} = \begin{pmatrix} 2a & b' \\ b' & 2c' \end{pmatrix}, \]

valid for any integer l.


Lemma 1.10. If (a, b, c) is a primitive form of nonsquare discriminant and if
$m \ne 0$ is an integer, then (a, b, c) primitively represents some integer relatively
prime to m.


PROOF. Let

    $w_0$ = product of all primes dividing a, c, and m,
    $x_0$ = product of all primes dividing a and m but not c,
    $y_0$ = product of all primes dividing m but not a.

Referring to the definitions, we see that any prime dividing m divides exactly
one of $w_0$, $x_0$, and $y_0$. In particular, $\mathrm{GCD}(x_0, y_0) = 1$. We shall show that
$\mathrm{GCD}(m, ax_0^2 + bx_0y_0 + cy_0^2) = 1$, and the proof will be complete. Arguing by
contradiction, suppose that a prime p divides $\mathrm{GCD}(m, ax_0^2 + bx_0y_0 + cy_0^2)$. There
are three cases for p, as follows.

Case 1. If p divides $x_0$, then the fact that p divides $ax_0^2 + bx_0y_0 + cy_0^2$ implies
that p divides $cy_0^2$. Since p does not divide $y_0$, p divides c, in contradiction to
the definition of $x_0$.

Case 2. If p divides $y_0$, then similarly p divides $ax_0^2$. Since p does not divide
$x_0$, p divides a, in contradiction to the definition of $y_0$.

Case 3. If p divides $w_0$, then the fact that p divides a and c implies that p
divides $bx_0y_0$. Since p divides neither $x_0$ nor $y_0$, p divides b, in contradiction to
the fact that (a, b, c) is primitive.
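
The proof of Lemma 1.10 is an explicit recipe. The following Python sketch follows it step by step, under the assumption that (a, b, c) is primitive; the names `prime_divisors` and `value_prime_to` are ours, and trial division is used only because it keeps the sketch self-contained.

```python
from math import gcd

def prime_divisors(n):
    """Distinct prime divisors of a nonzero integer n, by trial division."""
    n, primes, p = abs(n), [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def value_prime_to(a, b, c, m):
    """Return (x0, y0, value) as in the proof of Lemma 1.10, where
    value = a*x0^2 + b*x0*y0 + c*y0^2 is relatively prime to m."""
    x0 = y0 = 1
    for p in prime_divisors(m):
        if a % p != 0:
            y0 *= p          # p divides m but not a
        elif c % p != 0:
            x0 *= p          # p divides a and m but not c
        # otherwise p divides a, c, and m: it belongs to w0 and is left out
    return x0, y0, a * x0 * x0 + b * x0 * y0 + c * y0 * y0

x0, y0, value = value_prime_to(2, 2, 3, 30)
print(x0, y0, value, gcd(value, 30))
```

For the form (2, 2, 3) and m = 30 the recipe gives $x_0 = 2$, $y_0 = 15$, and the value 743, which is prime to 30.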


7The argument being used here—that a congruence modulo 2a implies the congruence of the 
squares modulo 4a — will be used again later in this section without detailed comment. 




Lemma 1.11. Suppose that $(a_1, b, c_1)$ and $(a_2, b, c_2)$ are properly equivalent
forms of nonsquare discriminant. If l is an integer such that $\mathrm{GCD}(a_1, a_2, l) = 1$
and such that l divides $\mathrm{GCD}(c_1, c_2)$, then $(la_1, b, l^{-1}c_1)$ and $(la_2, b, l^{-1}c_2)$ are
properly equivalent forms.


REMARK. Even if $(a_1, b, c_1)$ and $(a_2, b, c_2)$ are primitive, it does not follow
that $(la_1, b, l^{-1}c_1)$ and $(la_2, b, l^{-1}c_2)$ are primitive. In fact, one need only take
l = 2 and $(a_1, b, c_1) = (a_2, b, c_2) = (1, 2, 4)$.


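PROOF. Since $(a_1, b, c_1)$ and $(a_2, b, c_2)$ are properly equivalent, there exists
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ in SL(2, Z) with

\[ \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \begin{pmatrix} 2a_1 & b \\ b & 2c_1 \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} 2a_2 & b \\ b & 2c_2 \end{pmatrix}. \]

We multiply both sides on the right by $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}^{-1}$, and the result is the system of
four scalar equations

\[ 2a_1\alpha + b\gamma = 2a_2\delta - b\gamma, \]
\[ 2a_1\beta + b\delta = b\delta - 2c_2\gamma, \]
\[ b\alpha + 2c_1\gamma = -2a_2\beta + b\alpha, \]
\[ b\beta + 2c_1\delta = -b\beta + 2c_2\alpha. \]

The second and third equations simplify to $a_1\beta + c_2\gamma = 0$ and $a_2\beta + c_1\gamma = 0$.
Since l divides $c_1$ and $c_2$, these two simplified equations show that l divides $a_1\beta$
and $a_2\beta$. Since $\mathrm{GCD}(a_1, a_2, l) = 1$, it follows that l divides $\beta$.

Therefore the matrix $\begin{pmatrix} \alpha & l^{-1}\beta \\ l\gamma & \delta \end{pmatrix}$ of determinant 1 has integer entries. Direct
computation shows that

\[ \begin{pmatrix} \alpha & l\gamma \\ l^{-1}\beta & \delta \end{pmatrix} \begin{pmatrix} 2la_1 & b \\ b & 2l^{-1}c_1 \end{pmatrix} \begin{pmatrix} \alpha & l^{-1}\beta \\ l\gamma & \delta \end{pmatrix} = \begin{pmatrix} 2la_2 & b \\ b & 2l^{-1}c_2 \end{pmatrix}. \]

Consequently the forms $(la_1, b, l^{-1}c_1)$ and $(la_2, b, l^{-1}c_2)$ are properly equivalent.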


Theorem 1.12. Let D be a nonsquare discriminant, and let $C_1$ and $C_2$ be proper
equivalence classes of primitive forms of discriminant D.

(a) There exist aligned forms $(a_1, b, c_1) \in C_1$ and $(a_2, b, c_2) \in C_2$, and these
may be chosen in such a way that $a_1$ and $a_2$ are relatively prime to each other and
to any integer $m \ne 0$ given in advance.




(b) If the product of $C_1$ and $C_2$ is defined to be the proper equivalence class
of the composition of any aligned representatives of $C_1$ and $C_2$, as for example
the ones in (a), then the resulting product operation is well defined on proper 
equivalence classes of primitive forms of discriminant D. 

(c) Under the product operation in (b), the set of proper equivalence classes 
of primitive forms of discriminant D is a finite abelian group. The identity is the 
class of (1,0, —D/4) if D = 0 mod 4 and is the class of (1, 1, —(D — 1)/4) if 
D = 1 mod 4. The group inverse of the class of (a, b, c) is the class of (a, —b, c). 


REMARK. When D < 0, the proper equivalence classes of positive definite
forms are a subgroup. In fact, if $(a_1, b, c_1)$ and $(a_2, b, c_2)$ are positive definite
and are aligned, then $a_1$ and $a_2$ are positive, and therefore their composition
$(a_1a_2, b, j)$ has $a_1a_2$ positive and is positive definite. As was indicated in the
discussion before Lemma 1.10, the form class group for discriminant D is defined 
to be the group in (c) if D > 0, and it is defined to be the subgroup of classes of 
positive definite forms if D < 0. 


PROOF OF THEOREM 1.12a. By two applications of Lemma 1.10, C; primitively 
represents some integer a; prime to m, and C) primitively represents some integer 
ay prime to aym. Arguing as in the last part of the proof of Theorem 1.6b, we may 
assume without loss of generality that (x, y) = (1, 0) yields these values in each 
case. Then C; contains a form (a1, b;, *) for some b;, and C2 contains a form 
(a2, b2, *) for some bz. By the remarks before Lemma 1.10, C; contains every 
translate (a1, bj + 2a,/,, *), and C2 contains every translate (a2, bz + 2azlo, *). 

Let us make specific choices of $l_1$ and $l_2$. We know that $b_1 \equiv D \equiv b_2 \bmod 2$,
so that $b_2 - b_1$ is even. The construction of $a_1$ and $a_2$ was arranged to make
$\mathrm{GCD}(a_1, a_2) = 1$, and therefore $\mathrm{GCD}(2a_1, 2a_2) = 2$. Since $b_2 - b_1$ is even,
we can choose $l_1$ and $l_2$ such that $2a_1l_1 - 2a_2l_2 = b_2 - b_1$. Then $b_1 + 2a_1l_1 =
b_2 + 2a_2l_2$, and we take the common value as $b$.

For this $b$, $C_1$ contains the form $(a_1, b, *)$, and $C_2$ contains the form $(a_2, b, *)$.
Since we have arranged that $\mathrm{GCD}(a_1, a_2) = 1$, the remark immediately following
the definition of “aligned” shows that these forms are aligned.


PROOF OF THEOREM 1.12b. Suppose that
    $(a_1', b', *)$ is properly equivalent to $(a_1'', b'', *)$,
    $(a_2', b', *)$ is properly equivalent to $(a_2'', b'', *)$,
with the vertical pairs aligned. We are to show that
    $(a_1'a_2', b', *)$ is properly equivalent to $(a_1''a_2'', b'', *)$.          (*)


Theorem 1.12a applied to the integer $m = a_1'a_2'a_1''a_2''$ gives us an aligned pair of
forms $(a_1, b, *)$ and $(a_2, b, *)$ in the respective proper equivalence classes such
that $\mathrm{GCD}(a_1, a_2) = 1$ and $\mathrm{GCD}(a_1a_2, m) = 1$. If we can show that

    $(a_1'a_2', b', *)$ is properly equivalent to $(a_1a_2, b, *)$,          (**)

then we will have symmetrically that

    $(a_1''a_2'', b'', *)$ is properly equivalent to $(a_1a_2, b, *)$,

and (*) will follow from this fact and (**) by transitivity of proper equivalence.
We can now argue as in the proof of Theorem 1.12a. We know that $b \equiv D \equiv
b' \bmod 2$, so that $b' - b$ is even. The construction of $a_1$ and $a_2$ was arranged
to make $\mathrm{GCD}(a_1a_2, a_1'a_2') = 1$, and therefore $\mathrm{GCD}(2a_1a_2, 2a_1'a_2') = 2$. Since
$b' - b$ is even, we can choose $l$ and $l'$ such that $2a_1a_2l - 2a_1'a_2'l' = b' - b$. Then
$b + 2a_1a_2l = b' + 2a_1'a_2'l'$, and we take the common value as $B$. This $B$ has

    $B \equiv b \bmod 2a_1a_2$   and   $B \equiv b' \bmod 2a_1'a_2'$.


Thus

    $(a_1, b, *)$      is properly equivalent to   $(a_1, B, *)$,
    $(a_2, b, *)$      is properly equivalent to   $(a_2, B, *)$,          (†)
    $(a_1a_2, b, *)$   is properly equivalent to   $(a_1a_2, B, *)$,

and similarly

    $(a_1', b', *)$     is properly equivalent to   $(a_1', B, *)$,
    $(a_2', b', *)$     is properly equivalent to   $(a_2', B, *)$,          (††)
    $(a_1'a_2', b', *)$ is properly equivalent to   $(a_1'a_2', B, *)$.


By construction of $b$, $(a_1, b, *)$ is properly equivalent to $(a_1', b', *)$. This
equivalence, in combination with the first line of (†) and the first line of (††), shows
that

    $(a_1, B, *)$ is properly equivalent to $(a_1', B, *)$.          (‡)


Let us check that Lemma 1.11 is applicable to the two properly equivalent
forms of (‡) and to the integer $l = a_2'$. In fact, $\mathrm{GCD}(a_1, a_1', l) = 1$ follows
from $\mathrm{GCD}(a_1a_2, a_1'a_2') = 1$, and the problem is to show that $l = a_2'$ divides
$(D - B^2)/(4a_1)$ and $(D - B^2)/(4a_1')$. To see this divisibility, we observe that
$D - b'^2$ is divisible by $4a_1'a_2'$ because $(a_1', b', *)$ and $(a_2', b', *)$ are given as
aligned; the congruence $b' \equiv B \bmod 2a_1'a_2'$ implies that $b'^2 \equiv B^2 \bmod 4a_1'a_2'$,
and addition gives $D - B^2 \equiv 0 \bmod 4a_1'a_2'$. Meanwhile, $D - B^2$ is divisible
by $4a_1$ because the third member of $(a_1, B, *)$ is an integer. Since $D - B^2$ is
divisible also by $4a_1'a_2'$, and since $\mathrm{GCD}(a_1, a_1'a_2') = 1$, $D - B^2$ is divisible by
$4a_1a_1'a_2'$. Therefore $(D - B^2)/(4a_1)$ and $(D - B^2)/(4a_1')$ are divisible by $a_2'$, and
Lemma 1.11 is indeed applicable.
The application of Lemma 1.11 to (‡) with $l = a_2'$ shows that

    $(a_1a_2', B, *)$ is properly equivalent to $(a_1'a_2', B, *)$.

Similarly $(a_2, B, *)$ is properly equivalent to $(a_2', B, *)$, and an application of
Lemma 1.11 to this equivalence with $l = a_1$ shows that

    $(a_1a_2, B, *)$ is properly equivalent to $(a_1a_2', B, *)$.

The two results together show that

    $(a_1a_2, B, *)$ is properly equivalent to $(a_1'a_2', B, *)$.

Combining this equivalence with the third line of (†) and the third line of (††),
we obtain (**), and the proof of (b) is complete.


PROOF OF THEOREM 1.12c. The set of proper equivalence classes is finite by
Theorem 1.6a, and commutativity of multiplication is clear. Define $\delta$ to be 0 if
$D \equiv 0 \bmod 4$ and to be 1 if $D \equiv 1 \bmod 4$. Let us see that the class of $(1, \delta, *)$
is the identity. If $(a, b, c)$ has discriminant $D$, then $b \equiv \delta \bmod 2$, and hence
$(1, b, *) = (1, \delta + 2 \cdot 1 \cdot \tfrac12(b - \delta), *)$ is a translate of $(1, \delta, *)$. Consequently
$(1, b, *)$ and $(1, \delta, *)$ are properly equivalent. Since Proposition 1.9 shows that
the composition of $(a, b, c)$ and $(1, b, *)$ is $(a, b, *)$, Theorem 1.12b allows us to
conclude that the class of $(1, \delta, *)$ is the identity.

For inverses Theorem 1.12b shows that the product of the classes of $(a, b, c)$
and $(a, -b, c)$ is the product of the classes of $(a, b, c)$ and $(c, b, a)$, which is
the class of the composition $(a, b, c)(c, b, a)$. Proposition 1.9 shows that this
composition is $(ac, b, 1)$. Since $(ac, b, 1)$ is properly equivalent to $(1, -b, ac)$
and since the latter is properly equivalent to $(1, \delta, *)$, the class of the composition
$(a, b, c)(c, b, a)$ is the identity.

To complete the proof, we need to verify associativity. Let $C_1$, $C_2$, and $C_3$
be three proper equivalence classes of primitive forms of discriminant $D$. Let
$(a_1, b_1, c_1)$ be a form in the class $C_1$. Lemma 1.10 shows that $C_2$ represents an
integer $a_2$ prime to $a_1$, and then it follows that the form $(a_2, b_2, c_2)$ is in $C_2$ for some
integers $b_2$ and $c_2$. A second application of Lemma 1.10 shows that $C_3$ represents
an integer $a_3$ prime to $a_1a_2$, and then it follows that the form $(a_3, b_3, c_3)$ is in $C_3$ for
some integers $b_3$ and $c_3$. The middle components have $b_1 \equiv b_2 \equiv b_3 \equiv \delta \bmod 2$,
and thus $\tfrac12(b_j - \delta)$ is an integer for $j = 1, 2, 3$. Since $a_1, a_2, a_3$ are relatively
prime in pairs, the Chinese Remainder Theorem shows that the congruences
$x \equiv \tfrac12(b_j - \delta) \bmod a_j$ have a common integer solution $x$ for $j = 1, 2, 3$. Define
$b = 2x + \delta$. Then $b$ is a solution of $b \equiv b_j \bmod 2a_j$ for $j = 1, 2, 3$. Write
$b = b_j + 2a_jn_j$ for suitable integers $n_j$. Then $(a_j, b, *) = (a_j, b_j + 2a_jn_j, *)$
is a translate of $(a_j, b_j, c_j)$ and consequently is properly equivalent to it. Thus
$(a_j, b, *)$ lies in $C_j$. Taking into account Theorem 1.12b and using Proposition 1.9,
we see that $C_1(C_2C_3)$ and $(C_1C_2)C_3$ are both represented by the form $(a_1a_2a_3, b, *)$
and hence are equal.
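The Chinese Remainder step in the associativity argument can be sketched as a short computation. The helper name `common_b` is hypothetical; it assumes that the first coefficients $a_j$ are relatively prime in pairs and that every $b_j \equiv \delta \bmod 2$, as in the proof, and it uses a simple incremental search rather than an optimized CRT routine.

```python
def common_b(pairs, delta):
    """Return b = 2x + delta with b ≡ b_j (mod 2*a_j) for each (a_j, b_j).

    Hypothetical helper for the associativity step of Theorem 1.12c.
    Assumes the a_j are pairwise relatively prime and b_j ≡ delta (mod 2).
    """
    x, modulus = 0, 1
    for a, b in pairs:
        n = abs(a)
        target = ((b - delta) // 2) % n
        # Stepping x by the current modulus keeps all earlier congruences
        # intact; the loop ends because gcd(modulus, n) = 1.
        while x % n != target:
            x += modulus
        modulus *= n
    return 2 * x + delta


# Discriminant -56, so delta = 0, with forms (1,0,14), (2,0,7), (3,2,5):
print(common_b([(1, 0), (2, 0), (3, 2)], 0))
```

With this common $b$, the three forms have translates $(1, b, *)$, $(2, b, *)$, $(3, b, *)$ in their respective classes.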


5. Genera 


The theory of genera lumps proper equivalence classes of forms of a given
discriminant according to their values in some way. There are at least two possible
definitions of “genus,” and it is a deep result that they lead to the same thing
in all cases of interest. By way of background, we saw in Sections 2 and 3 for
discriminant $D = -56$ that the number of proper equivalence classes of binary
quadratic forms is exactly 4, representatives being $x^2 + 14y^2$, $2x^2 + 7y^2$, and
$3x^2 \pm 2xy + 5y^2$. The last two are improperly equivalent and take the same values
at integer points $(x, y)$, and there are no other improper equivalences. Thus the
first two take on a disjoint set of prime values from the values of $3x^2 \pm 2xy + 5y^2$
for integer points $(x, y)$, and the sets of prime values taken on by $x^2 + 14y^2$ and
$2x^2 + 7y^2$ at integer points are disjoint from one another.

Two possible lumpings of proper equivalence classes arise for this discriminant.
One is to identify forms when their values modulo 56 include the same residues
prime to 56. It is just a finite computation to see that

    $x^2 + 14y^2$ and $2x^2 + 7y^2$ take on the residues 1, 9, 15, 23, 25, 39,
    $3x^2 \pm 2xy + 5y^2$ take on the residues 3, 5, 13, 19, 27, 45.

Thus the first kind of lumping treats $x^2 + 14y^2$ and $2x^2 + 7y^2$ together because
of the residues they take on, and it treats $3x^2 + 2xy + 5y^2$ and $3x^2 - 2xy + 5y^2$
together. Gauss proceeded by using this kind of lumping to define “genus.”
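The finite computation of residues can be reproduced by brute force. The sketch below uses a hypothetical function `unit_residues` that runs over all $(x, y)$ modulo 56 and collects the values of a form that are prime to 56; nothing in it is taken from the text beyond the four forms themselves.

```python
from math import gcd


def unit_residues(form, modulus):
    """Sorted residues prime to `modulus` taken on by a*x^2 + b*x*y + c*y^2."""
    a, b, c = form
    values = set()
    for x in range(modulus):
        for y in range(modulus):
            v = (a * x * x + b * x * y + c * y * y) % modulus
            if gcd(v, modulus) == 1:
                values.add(v)
    return sorted(values)


for form in [(1, 0, 14), (2, 0, 7), (3, 2, 5), (3, -2, 5)]:
    print(form, unit_residues(form, 56))
```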

The other lumping is to identify integer forms that take on the same rational
values at rational points. Here $2x^2 + 7y^2 = 1$ for $(x, y) = (\tfrac13, \tfrac13)$, and of course
$x^2 + 14y^2 = 1$ for $(x, y) = (1, 0)$. Hence the sets of values of $x^2 + 14y^2$
and $2x^2 + 7y^2$ for $x$ and $y$ rational have a nonzero value in common. Lemma
1.13 below implies that the sets of rational values taken on by the two forms are
identical. The second kind of lumping treats $x^2 + 14y^2$ and $2x^2 + 7y^2$ together
because they take on the same rational values. We shall use this latter kind of
lumping because, as Theorem 1.14 below shows, this is the definition that more
quickly identifies the genus group once the form class group is known.




Problems 25-40 at the end of the chapter show that the two definitions of genus 
lead to the same thing for discriminants that are “fundamental” in a sense that we 
define in a moment. 

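We have defined two forms $(a, b, c)$ and $(a', b', c')$ with integer entries to be
“properly equivalent” if there is a matrix $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ in $SL(2, \mathbb{Z})$ with

\[
\begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix}
\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}
\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}
= \begin{pmatrix} 2a' & b' \\ b' & 2c' \end{pmatrix}.
\]

We say that two forms $(a, b, c)$ and $(a', b', c')$ with rational entries are properly
equivalent over $\mathbb{Q}$ if there is a matrix $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ in $SL(2, \mathbb{Q})$ such that the displayed
equality holds. For emphasis we can refer to the original notion as “proper
equivalence over $\mathbb{Z}$” when it is advisable to be more specific. It is evident that if
two forms with rational entries are properly equivalent over $\mathbb{Q}$, then their sets of
values at points $(x, y)$ in $\mathbb{Q} \times \mathbb{Q}$ are the same.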


Lemma 1.13. If $(a, b, c)$ is a form with rational coefficients and with nonsquare
discriminant $D$ that takes on a nonzero value $q \in \mathbb{Q}$ for some $(x_0, y_0)$
in $\mathbb{Q} \times \mathbb{Q}$, then $(a, b, c)$ is properly equivalent over $\mathbb{Q}$ to $(q, 0, -D/(4q))$.
Consequently two forms over $\mathbb{Q}$ of the same discriminant that take on a nonzero
value in common over $\mathbb{Q}$ are properly equivalent over $\mathbb{Q}$.

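PROOF. Suppose that $ax_0^2 + bx_0y_0 + cy_0^2 = q$. Put $\begin{pmatrix} \alpha \\ \gamma \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$. Since $x_0$
and $y_0$ cannot both be 0, we can choose rationals $\beta$ and $\delta$ such that $\alpha\delta - \beta\gamma = 1$.
Then $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ has determinant 1 and satisfies $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$. The equality
$ax_0^2 + bx_0y_0 + cy_0^2 = \tfrac12 \begin{pmatrix} x_0 & y_0 \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$ therefore yields

\[
q = \tfrac12 \begin{pmatrix} 1 & 0 \end{pmatrix}
\begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix}
\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}
\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \end{pmatrix}.
\]

It follows that $(a, b, c)$ is properly equivalent over $\mathbb{Q}$ to some form $(q, b', c')$
with $b'$ and $c'$ rational. Using a translation with a rational parameter, we see that
$(q, b', c')$ is properly equivalent over $\mathbb{Q}$ to a form $(q, 0, *)$. Inspection of the
discriminant shows that this last form must be $(q, 0, -D/(4q))$.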
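The two steps of this proof, a change of variables sending $(1, 0)$ to $(x_0, y_0)$ followed by a rational translation, can be carried out in exact arithmetic. The following is a minimal sketch; the function name `diagonalize_over_Q` and the particular choice of $\beta$ and $\delta$ are assumptions made for illustration, since the proof allows any rationals with $\alpha\delta - \beta\gamma = 1$.

```python
from fractions import Fraction


def diagonalize_over_Q(a, b, c, x0, y0):
    """Return (q, 0, -D/(4q)) for the form (a, b, c) taking value q at (x0, y0).

    Hypothetical helper following the proof of Lemma 1.13.
    """
    a, b, c, x0, y0 = map(Fraction, (a, b, c, x0, y0))
    q = a * x0 * x0 + b * x0 * y0 + c * y0 * y0
    if q == 0:
        raise ValueError("the value at (x0, y0) must be nonzero")
    alpha, gamma = x0, y0
    # One convenient choice of beta, delta with alpha*delta - beta*gamma = 1.
    if x0 != 0:
        beta, delta = Fraction(0), 1 / x0
    else:
        beta, delta = -1 / y0, Fraction(0)
    # Coefficients of f(alpha*x + beta*y, gamma*x + delta*y); the first is q.
    b1 = 2 * a * alpha * beta + b * (alpha * delta + beta * gamma) + 2 * c * gamma * delta
    c1 = a * beta * beta + b * beta * delta + c * delta * delta
    # Translation x -> x + t*y with t chosen to kill the middle coefficient.
    t = -b1 / (2 * q)
    return (q, b1 + 2 * q * t, q * t * t + b1 * t + c1)


print(diagonalize_over_Q(2, 0, 7, Fraction(1, 3), Fraction(1, 3)))
print(diagonalize_over_Q(1, 0, 14, 1, 0))
```

Both calls use discriminant $-56$ and the common value $q = 1$, so by the lemma they should produce the same form over $\mathbb{Q}$.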


Two primitive integer forms having the same discriminant are said to be in
the same genus (plural: genera) if they are properly equivalent over $\mathbb{Q}$. In view
of Lemma 1.13 the condition is that they are primitive and take on a common
nonzero value over $\mathbb{Q}$, or equivalently that they are primitive and take on the same
set of values over $\mathbb{Q}$. Thus $x^2 + 14y^2$ and $2x^2 + 7y^2$ furnish an example of two
forms in distinct classes that are in the same genus. Two primitive integer forms
that are in the same proper equivalence class over $\mathbb{Z}$ are in the same genus. The
genus of the class $C$ will be denoted by $[C]$. The identity class will be denoted
by $\mathcal{E}$, and $P = [\mathcal{E}]$ is called the principal genus. If $(a, b, c)$ is an integer form
representing a class $C$, then Theorem 1.12c shows that $(a, -b, c)$ represents $C^{-1}$.
On the other hand, $C$ and $C^{-1}$ take on the same values over $\mathbb{Z}$, as we see by
replacing $(x, y)$ by $(x, -y)$, and it follows that $[C] = [C^{-1}]$.

For the main theorem about genera, we shall introduce an extra hypothesis on 
the discriminant D. A nonsquare integer D will be said to be a fundamental 
discriminant if D is not divisible by the square of any odd prime and if when 
D is even, D/4 is congruent to 2 or 3 modulo 4. It will be seen later that this 
condition is equivalent to the requirement that D be the “field discriminant” of 
some quadratic number field. Examples of discriminants that are not fundamental 
are D = —12, —44, —108. 
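The definition of a fundamental discriminant translates directly into a test. In the sketch below the function name `is_fundamental_discriminant` is hypothetical, and one assumption is added that the definition leaves implicit: $D$ is required to be a discriminant in the first place, that is, $D \equiv 0$ or $1 \bmod 4$.

```python
from math import isqrt


def is_fundamental_discriminant(D):
    """Check the definition: nonsquare, no odd prime square divides D,
    and D/4 ≡ 2 or 3 (mod 4) when D is even.

    Assumption (not stated in the definition): D ≡ 0 or 1 (mod 4).
    """
    if D % 4 not in (0, 1):
        return False
    if D >= 0 and isqrt(D) ** 2 == D:
        return False                      # squares are excluded
    n = abs(D)
    p = 3
    while p * p <= n:
        # If the square of an odd number divides n, so does an odd prime square.
        if n % (p * p) == 0:
            return False
        p += 2
    if D % 2 == 0:
        return (D // 4) % 4 in (2, 3)
    return True


for D in (-56, -12, -44, -108):
    print(D, is_fundamental_discriminant(D))
```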

With this condition imposed on $D$, any integer form $(a, b, c)$ of discriminant $D$
is automatically primitive. In fact, no odd prime $p$ can divide $\mathrm{GCD}(a, b, c)$, since
then $p^2$ would divide $D$. If 2 were to divide $\mathrm{GCD}(a, b, c)$, then $(a/2, b/2, c/2)$
would be an integer form, and $D/4 = (b/2)^2 - 4(a/2)(c/2)$ would be an integer
congruent to 0 or 1 modulo 4, contrary to the hypothesis on $D/4$.


Theorem 1.14. For a fundamental discriminant $D$, the principal genus $P$ of
primitive integer forms⁸ is a subgroup of the form class group $H$, and the cosets
of $P$ are the various genera. Thus the set $G$ of genera is exactly the set of cosets
$H/P$ and inherits a group structure from class multiplication. The subgroup $P$
coincides with the subgroup of squares in $H$, and consequently every nontrivial
element of $G$ has order 2.


REMARKS. The group $G$ is called the genus group of discriminant $D$. The
hypothesis that $D$ is fundamental is needed only for the conclusion that every
member of $P$ is a square in $H$. Since every nontrivial element of $G$ has order
2 when $D$ is fundamental, application of the Fundamental Theorem of Finitely
Generated Abelian Groups or use of vector-space theory over a 2-element field
shows that $G$ is the direct sum of cyclic groups of order 2; in particular, the order
of $G$ is a power of 2. Problems 25-29 at the end of the chapter show that the
order of $G$ is $2^g$, where $g + 1$ is the number of distinct prime factors of $D$.


PROOF. Let $V(C)$ denote the set of $\mathbb{Q}$ values assumed by forms in the class $C$
at points $(x, y)$ in $\mathbb{Q} \times \mathbb{Q}$. If $S$ and $S'$ are two genera and if $C$ is a class in $S$ and
$C'$ is a class in $S'$, we define $S \cdot S' = [CC']$.


⁸ As usual, we exclude the negative definite classes in the discussion.




To see that this product operation is well defined on the set $G$ of genera, let $C''$
be in $S'$ also. Then $V(C') = V(C'')$. If $q$ is in $V(C)$ and $q'$ is in $V(C') = V(C'')$,
then the prescription for multiplying classes shows that $qq'$ is in $V(CC')$ and
$V(CC'')$. Hence $V(CC') = V(CC'')$, and $[CC'] = [CC'']$. Therefore multiplication
of genera is well defined. Define a function $\varphi : H \to G$ by $\varphi(C) = [C]$. Then
the computation

    $\varphi(CC') = [CC'] = [C][C'] = \varphi(C)\varphi(C')$

shows that $\varphi$ is a homomorphism of $H$ onto $G$. The kernel of $\varphi$ is $[\mathcal{E}] = P$, which
is therefore a subgroup, and the image of $\varphi$, which is the set $G$ of genera with its
product operation, has to be a group.

For any class $C$, the equality $[C] = [C^{-1}]$ implies that $[C^2] = [C][C] =
[C][C^{-1}] = [CC^{-1}] = [\mathcal{E}] = P$. Hence $P$ contains all squares. Conversely let $C$
be in $P$. Then $C$ takes on the value 1 over $\mathbb{Q}$. If $(a, b, c)$ is a form in the class $C$,
then there exist rationals $r$ and $s$ with $ar^2 + brs + cs^2 = 1$. Clearing fractions,
we see that there exist integers $x$ and $y$ such that $ax^2 + bxy + cy^2 = n^2$ for some
integer $n \neq 0$. Without loss of generality, we may assume that $n$ is positive.
Since $(a, b, c)$ is primitive, a familiar argument allows us to make a substitution
for which the value $n^2$ is taken on at $(x, y) = (1, 0)$. In other words, $(a, b, c)$
is properly equivalent over $\mathbb{Z}$ to a form $(n^2, b', c')$ for suitable integers $b'$ and
$c'$. The composition formula in Proposition 1.9 shows that the composition of
$(n, b', c'n)$ with itself is $(n^2, b', c')$, and hence $C$ is exhibited as the square of the
class of $(n, b', c'n)$. Since $(n, b', c'n)$ has the same discriminant $D$ as $(n^2, b', c')$
and therefore as $(a, b, c)$ and since $D$ is fundamental, $(n, b', c'n)$ is primitive.
Therefore $C$ is the square of a class of primitive forms. If $C$ is positive definite,
then the above choice of the sign of $n$ as positive makes $(n, b', c'n)$ positive
definite. Hence the class of $(n, b', c'n)$ is in $H$.


EXAMPLE. The discriminant $D = -56$ is fundamental, and we have seen that
the form class group is of order 4 with representatives $x^2 + 14y^2$, $2x^2 + 7y^2$, and
$3x^2 \pm 2xy + 5y^2$. We have seen also that $x^2 + 14y^2$ and $2x^2 + 7y^2$ both lie in the
principal genus $P$. A group of order 4 must be isomorphic to the cyclic group
$C_4$ or to $C_2 \times C_2$. In the first case the subgroup of squares has order 2, and in the
second case the subgroup of squares has order 1. Since we have already found
two elements in $P$, $P$ has order exactly 2. By the theorem we must be in the first
case. Hence $H$ is of type $C_4$, and the genus group $G$ is of type $C_2$. It is possible to
check directly that $3x^2 + 2xy + 5y^2$ has order 4 by making computations similar
to those for Problem 4d at the end of the chapter.
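The count of 4 classes for $D = -56$ can be rechecked by enumerating reduced forms. The sketch below is not from this section: it assumes the standard reduction conditions for positive definite forms ($-a < b \le a \le c$, with $b \ge 0$ when $a = c$), which belong to the reduction theory of Section 3, and the function name `reduced_primitive_forms` is hypothetical.

```python
from math import gcd


def reduced_primitive_forms(D):
    """List primitive reduced positive definite forms (a, b, c) with b^2 - 4ac = D.

    Assumes D < 0 and D ≡ 0 or 1 (mod 4). Reduced is taken to mean
    -a < b <= a <= c, with b >= 0 when a == c (an assumption; the
    conditions are not restated in this section).
    """
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("need a negative discriminant")
    forms = []
    a = 1
    while 3 * a * a <= -D:               # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms


print(reduced_primitive_forms(-56))
```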


6. Quadratic Number Fields and Their Units 


In this section we review material about quadratic number fields that appears in 
various places in Basic Algebra, and we determine the units in the ring of integers 
of such a number field. 

Quadratic number fields are extension fields $K$ of $\mathbb{Q}$ with $[K : \mathbb{Q}] = 2$. Such a
field is necessarily of the form $K = \mathbb{Q}(\sqrt{m})$, where $m$ is a uniquely determined
square-free integer not equal to 0 or 1. The set $\{1, \sqrt{m}\}$ is a vector-space basis
of $K$ over $\mathbb{Q}$.

The extension $K/\mathbb{Q}$ is a Galois extension, and the Galois group $\mathrm{Gal}(K/\mathbb{Q})$
of automorphisms of $K$ fixing $\mathbb{Q}$ has two elements. We denote the nontrivial
element of the Galois group by $\sigma$; its values on the members of the vector-space
basis are $\sigma(1) = 1$ and $\sigma(\sqrt{m}) = -\sqrt{m}$.

The norm $N = N_{K/\mathbb{Q}}$ and trace $\mathrm{Tr} = \mathrm{Tr}_{K/\mathbb{Q}}$ are given by $N(\alpha) = \alpha \cdot \sigma(\alpha)$ and
$\mathrm{Tr}(\alpha) = \alpha + \sigma(\alpha)$. Thus $N(a + b\sqrt{m}) = a^2 - mb^2$ and $\mathrm{Tr}(a + b\sqrt{m}) = 2a$.
These values are members of $\mathbb{Q}$. The norm is multiplicative in the sense that
$N(\alpha\beta) = N(\alpha)N(\beta)$, and $N(1) = 1$.

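The ring $R$ of algebraic integers in $K$ is the integral closure of $\mathbb{Z}$ in $K$. It works
out to be

\[
R = \begin{cases}
\mathbb{Z}[\sqrt{m}\,] & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\
\mathbb{Z}[\tfrac12(\sqrt{m} - 1)] & \text{if } m \equiv 1 \bmod 4
\end{cases}
\]

and is therefore a free abelian group of rank 2. The automorphism $\sigma$ carries $R$ to
itself. The norm and trace of any member of $R$ are in $\mathbb{Z}$; conversely any member
of $K$ whose norm and trace are in $\mathbb{Z}$ is in $R$. We define the algebraic integer $\delta$ to
be given by

\[
\delta = \begin{cases}
-\sqrt{m} & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\
\tfrac12(1 - \sqrt{m}) & \text{if } m \equiv 1 \bmod 4.
\end{cases}
\]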


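Then $\{1, \delta\}$ is a $\mathbb{Z}$ basis of $R$. The norm and trace of $\delta$ are given by

\[
N(\delta) = \delta \cdot \sigma(\delta) = \begin{cases}
-m & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\
\tfrac14(1 - m) & \text{if } m \equiv 1 \bmod 4,
\end{cases}
\]

\[
\mathrm{Tr}(\delta) = \delta + \sigma(\delta) = \begin{cases}
0 & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\
1 & \text{if } m \equiv 1 \bmod 4.
\end{cases}
\]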


There is a general notion of field discriminant $D$, or absolute discriminant,
for an algebraic number field, whose definition will be given in Chapter V. We
shall not give that definition in general now but will be content to give the formula
for $D$ in the quadratic number field $\mathbb{Q}(\sqrt{m})$, namely

\[
D = \begin{cases}
4m & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\
m & \text{if } m \equiv 1 \bmod 4.
\end{cases}
\]




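The units of $K$ are understood to be the members of the group $R^\times$ of units in
the ring $R$. These are the members $\varepsilon$ of $R$ with $N(\varepsilon) = \pm 1$. In fact, if $\varepsilon$ is a unit,
then the equality $\varepsilon\varepsilon^{-1} = 1$ implies that $1 = N(1) = N(\varepsilon\varepsilon^{-1}) = N(\varepsilon)N(\varepsilon^{-1})$
and shows that $N(\varepsilon)$ is a unit in $\mathbb{Z}$. Thus $N(\varepsilon) = \pm 1$. Conversely if $N(\varepsilon) = \pm 1$,
then $\pm\varepsilon\sigma(\varepsilon) = 1$ shows that $\sigma(\varepsilon) = \pm\varepsilon^{-1}$; since $\sigma(\varepsilon)$ is in $R$, $\varepsilon$ is exhibited as
in $R^\times$ and is therefore a unit.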

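For $m < 0$, the units of $\mathbb{Q}(\sqrt{m})$ are easily determined. In fact, if $\varepsilon = a + b\delta$
with $a$ and $b$ in $\mathbb{Z}$, then $N(\varepsilon) = (a + b\delta)(a + b\sigma(\delta)) = a^2 + ab\,\mathrm{Tr}(\delta) + b^2N(\delta)$
with each term equal to an integer and with the end terms $\ge 0$. Sorting out the
possibilities, we see that

\[
R^\times = \begin{cases}
\{\pm 1, \pm\sqrt{-1}\} & \text{if } m = -1, \\
\{\pm 1, \tfrac12(\pm 1 \pm \sqrt{-3})\} & \text{if } m = -3, \\
\{\pm 1\} & \text{for all other } m < 0.
\end{cases}
\]

The respective orders of $R^\times$ are 4, 6, and 2.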
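The "sorting out" for $m < 0$ is a finite search, which the following sketch carries out. The function name `units_imaginary` is hypothetical, and the search window $|a|, |b| \le 2$ is an assumption justified by the fact that $a^2 + ab\,\mathrm{Tr}(\delta) + b^2N(\delta) = 1$ forces $|a|$ and $|b|$ to be at most 1 when $m < 0$.

```python
def units_imaginary(m):
    """Units a + b*delta of the ring of integers of Q(sqrt(m)), m < 0 square-free.

    Returns the list of coefficient pairs (a, b) with N(a + b*delta) = 1.
    For m < 0 the norm is positive, so N = -1 cannot occur.
    """
    if m % 4 == 1:
        tr, nm = 1, (1 - m) // 4     # Tr(delta) = 1, N(delta) = (1 - m)/4
    else:
        tr, nm = 0, -m               # Tr(delta) = 0, N(delta) = -m
    return [(a, b)
            for a in range(-2, 3)
            for b in range(-2, 3)
            if a * a + tr * a * b + nm * b * b == 1]


for m in (-1, -3, -2, -7):
    print(m, len(units_imaginary(m)))
```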
Determination of the units when m > 0 is more delicate. We require a lemma. 


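Lemma 1.15. If $\alpha$ is a real irrational number and if $N > 0$ is an integer, then
there exist integers $A$ and $B$ with

\[
|B\alpha - A| < \frac{1}{N} \quad \text{and} \quad 0 < B \le N.
\]

For this $A$ and this $B$,

\[
\left| \alpha - \frac{A}{B} \right| < \frac{1}{B^2}.
\]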

PROOF. Put $\alpha_n = n\alpha - [n\alpha]$, where $[\,\cdot\,]$ denotes the greatest-integer function.
Then $0 \le \alpha_n < 1$. We partition the half-open interval $[0, 1)$ into $N$ subintervals
$[\frac{t-1}{N}, \frac{t}{N})$ with $1 \le t \le N$. For $0 \le n \le N$, the expression $\alpha_n$ takes on $N + 1$
distinct values because $\alpha_n = \alpha_m$ would imply that $(n - m)\alpha$ is in $\mathbb{Z}$. Hence
there exist $\alpha_n$ and $\alpha_m$ with $n > m$ that lie in the same subinterval $[\frac{t-1}{N}, \frac{t}{N})$.
Then $|\alpha_n - \alpha_m| < \frac{1}{N}$. If we take $B = n - m$ and $A = [n\alpha] - [m\alpha]$, then
$|B\alpha - A| = |\alpha_n - \alpha_m|$, and the inequality $|B\alpha - A| < \frac{1}{N}$ follows. Dividing this
inequality by $B$ gives $|\alpha - \frac{A}{B}| < \frac{1}{BN}$, and this is $\le \frac{1}{B^2}$ because $N \ge B$.
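The pigeonhole argument is constructive, and for $\alpha = \sqrt{m}$ it can be run in exact integer arithmetic. In the sketch below the function name `dirichlet_pair` is hypothetical, and the restriction to $\alpha = \sqrt{m}$ with $m$ a positive nonsquare integer is an assumption made so that $[n\alpha]$ can be computed as `isqrt(n*n*m)` without floating point.

```python
from math import isqrt


def dirichlet_pair(m, N):
    """Return integers (A, B) with |B*sqrt(m) - A| < 1/N and 0 < B <= N.

    Follows the pigeonhole proof of Lemma 1.15 for alpha = sqrt(m),
    where m is assumed to be a positive nonsquare integer.
    """
    seen = {}                                   # bucket index -> (n, [n*alpha])
    for n in range(N + 1):
        floor_n = isqrt(n * n * m)              # [n * sqrt(m)]
        # Index t - 1 of the subinterval containing the fractional part:
        # floor(N * (n*sqrt(m) - floor_n)) computed exactly.
        bucket = isqrt(N * N * n * n * m) - N * floor_n
        if bucket in seen:
            n0, floor_n0 = seen[bucket]
            return floor_n - floor_n0, n - n0   # A, B
        seen[bucket] = (n, floor_n)
    raise AssertionError("unreachable: N + 1 values fall into N subintervals")


A, B = dirichlet_pair(2, 10)
print(A, B, abs(B * 2 ** 0.5 - A))
```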


Proposition 1.16. For $K = \mathbb{Q}(\sqrt{m})$ with $m > 0$, the units are the members
of the infinite group

\[
R^\times = \{\pm\varepsilon_1^n \mid n \in \mathbb{Z}\} \cong \mathbb{Z} \times C_2,
\]

where $\varepsilon_1$ is the fundamental unit, defined as the least unit $> 1$.




REMARK. For example, when $m = 2$, the fundamental unit is $\varepsilon_1 = 1 + \sqrt{2}$.


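PROOF. The units $\omega$ with $|\omega| = 1$ are $\pm 1$, since the members of $K$ are real
numbers. We shall show shortly that there exists a unit $\omega$ with $|\omega| \neq 1$. Then $\omega$ or
$\omega^{-1}$ has absolute value $> 1$. Let us say that $|\omega| > 1$. Then one of $\omega$ and $-\omega$ is $> 1$.
Let us say that $\omega > 1$. Write $\omega = a + b\sqrt{m}$, so that $\sigma(\omega) = a - b\sqrt{m} = \pm\omega^{-1}$
has $|\sigma(\omega)| < 1$. Then

    $|2a| = |\omega + \sigma(\omega)| \le |\omega| + |\sigma(\omega)| < |\omega| + 1$

and

    $|2b\sqrt{m}| = |\omega - \sigma(\omega)| \le |\omega| + |\sigma(\omega)| < |\omega| + 1$

together show that there are only finitely many units $\omega'$ with $1 < |\omega'| \le |\omega|$.
Hence the existence of a unit $\omega$ with $|\omega| \neq 1$ implies the existence of a fundamental
unit $\varepsilon_1$.

If $\omega'$ is any unit $> 1$, then we can choose a power $\varepsilon_1^n$ of $\varepsilon_1$ with $\varepsilon_1^n \le \omega' < \varepsilon_1^{n+1}$,
by the archimedean property of $\mathbb{R}$. Then $\omega'\varepsilon_1^{-n}$ is a unit $\ge 1$ with $|\omega'\varepsilon_1^{-n}| < \varepsilon_1$.
Since $\varepsilon_1$ is fundamental, $\omega'\varepsilon_1^{-n}$ is 1, and thus $\omega' = \varepsilon_1^n$. Then it follows that the
group of units has the asserted form.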

Thus we need to exhibit some unit $\omega$ with $|\omega| \neq 1$. We apply Lemma 1.15
with $\alpha = \sqrt{m}$ and with $N$ arbitrary. Then we obtain infinitely many pairs $(A, B)$
of integers with $|\sqrt{m} - \frac{A}{B}| < \frac{1}{B^2} \le 1$, hence with $|A/B| < 1 + \sqrt{m}$. For each
such pair $(A, B)$, the member $r = A - B\sqrt{m}$ of $R$ has

\[
|N(r)| = |(A + B\sqrt{m})(A - B\sqrt{m})|
= \left|\tfrac{A}{B} - \sqrt{m}\right| \, |B^2| \, \left|\tfrac{A}{B} + \sqrt{m}\right|
\le \tfrac{1}{B^2} \, B^2 \, (1 + 2\sqrt{m}) = 1 + 2\sqrt{m}.
\]

Thus there are infinitely many $r$ in $R$ with $|N(r)| \le 1 + 2\sqrt{m}$. Since the norm
of an algebraic integer is in $\mathbb{Z}$, there is some integer $n$ such that infinitely many
$r \in R$ have $N(r) = n$. Among the elements $r \in R$ with $N(r) = n$, which
we write as $r = A + B\sqrt{m}$ with $A$ and $B$ in $\tfrac12\mathbb{Z}$, we consider the finitely many
congruence classes of $(A, B)$ modulo $n$, saying that two such $(A, B)$ and $(A', B')$
are congruent if $A - A'$ and $B - B'$ are integers divisible by $n$. Since infinitely
many $r \in R$ have $N(r) = n$, there must be infinitely many of these in some
particular congruence class. Take three such, say $\alpha_1$, $\alpha_2$, and $\alpha_3$. Then

    $N(\alpha_1) = N(\alpha_2) = N(\alpha_3) = n$

with

    $\dfrac{\alpha_1 - \alpha_2}{n}$ in $R$   and   $\dfrac{\alpha_1 - \alpha_3}{n}$ in $R$.

Since $n = N(\alpha_2) = \alpha_2\sigma(\alpha_2)$, we see that

\[
\frac{\alpha_1}{\alpha_2} = 1 + \left(\frac{\alpha_1 - \alpha_2}{n}\right)\sigma(\alpha_2).
\]

Thus $\alpha_1/\alpha_2$ is exhibited as in $R$, and it has $N(\alpha_1/\alpha_2) = N(\alpha_1)/N(\alpha_2) = n/n =
1$. Hence $\alpha_1/\alpha_2$ is a unit different from $+1$. Arguing similarly with $\alpha_1/\alpha_3$, we
see that $\alpha_1/\alpha_3$ is a unit different from $+1$ and not equal to $\alpha_1/\alpha_2$. Hence one of
$\alpha_1/\alpha_2$ and $\alpha_1/\alpha_3$ is a unit whose absolute value is not 1.
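For small $m$ the fundamental unit can be found by direct search rather than by the existence argument above. The sketch below is a brute-force illustration, not the method of the proof. It writes a unit as $\tfrac12(u + v\sqrt{m})$ with $u^2 - mv^2 = \pm 4$ and relies on a fact not proved in this section: among units $> 1$, the one with the least $v > 0$ (and the least $u$ in case of a tie) is the smallest. The function name `fundamental_unit` is hypothetical.

```python
from math import isqrt


def fundamental_unit(m):
    """Return (u, v, norm) with fundamental unit (u + v*sqrt(m))/2.

    Assumes m > 1 is square-free. Brute-force search over v = 1, 2, 3, ...
    """
    v = 1
    while True:
        # When m ≡ 2 or 3 (mod 4) the ring is Z[sqrt(m)], so v must be even.
        if m % 4 == 1 or v % 2 == 0:
            for norm in (-1, 1):              # norm -1 gives the smaller u
                t = m * v * v + 4 * norm      # candidate value of u^2
                u = isqrt(t) if t > 0 else 0
                if u > 0 and u * u == t:
                    return u, v, norm
        v += 1


for m in (2, 3, 5, 7):
    print(m, fundamental_unit(m))
```

For $m = 2$ the result encodes $\tfrac12(2 + 2\sqrt{2}) = 1 + \sqrt{2}$, in agreement with the remark after Proposition 1.16.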


7. Relationship of Quadratic Forms to Ideals 


We continue with $K$ as the quadratic number field $\mathbb{Q}(\sqrt{m})$ and $R$ as the ring of
algebraic integers in $K$. Here $R = \mathbb{Z}[\delta]$, where $\delta = -\sqrt{m}$ if $m \equiv 2$ or $3 \bmod 4$
and $\delta = \tfrac12(1 - \sqrt{m})$ if $m \equiv 1 \bmod 4$. Let $D$ be the field discriminant of $\mathbb{Q}(\sqrt{m})$
as defined in Section 6.

The topic of this section is a relationship between nonzero ideals in R and 
binary quadratic forms with discriminant D. Binary quadratic forms with D as 
discriminant are automatically primitive. 

The relationship is not a one-one correspondence of ideals to forms but a one- 
one correspondence of a certain kind of equivalence class of ideals to proper 
equivalence classes of forms. We saw in Theorem 1.12 that the latter collection 
has the structure of a finite abelian group, and we shall see in this section that the 
former collection has the natural structure of a finite abelian group as well. The 
correspondence is a group isomorphism, according to Theorem 1.20 below. 

Consider nonzero ideals $I$ in $R$. The first observation is that $I$ is additively a
free abelian group of rank 2. In fact, $R$ itself is additively a free abelian group of
rank 2, and the additive subgroup $I$ has to be free abelian of rank $\le 2$. If $r$ is a
nonzero element in $I$, then $N(r) = r\sigma(r)$ is in $I$, and thus $I$ contains a nonzero
integer. If $n$ is a nonzero integer in $I$, then $n\sqrt{m}$ is in $I$, and thus $I$ contains a
noninteger. Therefore $I$ is a free abelian group of rank exactly 2, as asserted.

Certainly $I$ can then be generated as an ideal by two elements, and our
customary notation has been to write $I = (r_1, r_2)$ in this case. However, without an
extra condition on them, the two ideal generators need not together be a $\mathbb{Z}$ basis
for $I$ because they need not generate all of $I$ additively. It will be helpful to have
separate notation when the generators are known to give a $\mathbb{Z}$ basis. Accordingly
we shall write $I = \langle r_1, r_2 \rangle$ when $r_1, r_2$ give a $\mathbb{Z}$ basis of $I$. In this case it will
be helpful also to regard the set $\{r_1, r_2\}$ as ordered with $r_1$ preceding $r_2$, and we
shall often do so.

Now suppose that $I = \langle r_1, r_2 \rangle$ is a nonzero ideal, and consider the expression

\[
r_1\sigma(r_2) - \sigma(r_1)r_2 = \det \begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2) \end{pmatrix}.
\]


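If $I$ is written in terms of a second ordered $\mathbb{Z}$ basis as $I = \langle s_1, s_2 \rangle$, then the two
ordered bases are related by a matrix $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ in $GL(2, \mathbb{Z})$, the relationship being

\[
\begin{pmatrix} r_1 \\ r_2 \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} s_1 \\ s_2 \end{pmatrix}.
\]

Hence

\[
\begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2) \end{pmatrix}
= \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}
\begin{pmatrix} s_1 & \sigma(s_1) \\ s_2 & \sigma(s_2) \end{pmatrix},
\]

and therefore

\[
\det \begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2) \end{pmatrix}
= \pm \det \begin{pmatrix} s_1 & \sigma(s_1) \\ s_2 & \sigma(s_2) \end{pmatrix},
\]

where $\pm 1$ is the determinant of $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$. Consequently the expression

\[
N(I) = \frac{|r_1\sigma(r_2) - \sigma(r_1)r_2|}{|\sqrt{D}\,|},
\]

where $D$ is the field discriminant of $K$, is independent of the choice of $\mathbb{Z}$ basis.
It is called the norm of the ideal $I$. The factor of $\sqrt{D}$ in the denominator is a
normalization factor that arranges for the norm of the ideal $I = R$ to be 1; in fact,
we can write $R = \langle 1, \delta \rangle$ with $\delta$ as in the first paragraph of this section, and then

\[
N(R) = \frac{|\sigma(\delta) - \delta|}{|\sqrt{D}\,|}
= \begin{cases}
\dfrac{|\sqrt{m} + \sqrt{m}\,|}{|\sqrt{4m}\,|} & \text{if } m \equiv 2 \text{ or } 3 \bmod 4 \\[2ex]
\dfrac{|\tfrac12(1 + \sqrt{m}) - \tfrac12(1 - \sqrt{m})|}{|\sqrt{m}\,|} & \text{if } m \equiv 1 \bmod 4
\end{cases}
\Bigg\} = 1.
\]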
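The norm of an ideal can be computed exactly from a $\mathbb{Z}$ basis. The sketch below represents $r_j = x_j + y_j\sqrt{m}$ by the pair $(x_j, y_j)$ of rationals; a direct expansion, carried out here and not in the text, gives $r_1\sigma(r_2) - \sigma(r_1)r_2 = 2(x_2y_1 - x_1y_2)\sqrt{m}$, so the square roots cancel against $|\sqrt{D}|$. The function name `ideal_norm` and the pair representation are assumptions for illustration.

```python
from fractions import Fraction


def ideal_norm(m, r1, r2):
    """N(I) for I = <r1, r2>, where r_j = (x_j, y_j) means x_j + y_j*sqrt(m).

    Assumes m is square-free, not 0 or 1, and that r1, r2 really form a
    Z basis of an ideal; neither condition is checked.
    """
    (x1, y1), (x2, y2) = r1, r2
    # |r1*sigma(r2) - sigma(r1)*r2| = coeff * sqrt(|m|)
    coeff = abs(2 * (Fraction(x2) * y1 - Fraction(x1) * y2))
    # |sqrt(D)| is 2*sqrt(|m|) when D = 4m and sqrt(|m|) when D = m.
    return coeff if m % 4 == 1 else coeff / 2


half = Fraction(1, 2)
print(ideal_norm(2, (1, 0), (0, -1)))        # R = <1, delta>, m = 2
print(ideal_norm(5, (1, 0), (half, -half)))  # R = <1, delta>, m = 5
print(ideal_norm(2, (3, 0), (0, -3)))        # principal ideal (3) = <3, 3*delta>
```

The last line illustrates the formula $N((r)) = |N(r)|$ with $r = 3$.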


Since the norm of an element of $R$ is given by $N(r) = r\sigma(r)$, it is immediate
from the definition that

    $N(rI) = |N(r)|\,N(I)$   for $r \in R$.

Consequently the norm of the principal ideal $(r)$ is given by

    $N((r)) = |N(r)|\,N(R) = |N(r)| \cdot 1 = |N(r)|$   for $r \in R$.


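Still with $I = \langle r_1, r_2 \rangle$, let us observe that

    $\sigma(r_1\sigma(r_2) - \sigma(r_1)r_2) = -(r_1\sigma(r_2) - \sigma(r_1)r_2)$.

It follows that

\[
r_1\sigma(r_2) - \sigma(r_1)r_2 \text{ is } \begin{cases}
\text{real} & \text{if } m > 0, \\
\text{imaginary} & \text{if } m < 0.
\end{cases}
\]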




Since $r_1\sigma(r_2) - \sigma(r_1)r_2$ changes sign when $r_1$ and $r_2$ are interchanged, let us say
that the expression $I = \langle r_1, r_2 \rangle$ for $I$ is positively oriented if $r_1\sigma(r_2) - \sigma(r_1)r_2$
is positive or positive imaginary,⁹ negatively oriented if $r_1\sigma(r_2) - \sigma(r_1)r_2$ is
negative or negative imaginary. If $I = \langle r_1, r_2 \rangle$, then exactly one of the expressions
$I = \langle r_1, r_2 \rangle$ and $I = \langle r_2, r_1 \rangle$ is positively oriented. The notion of orientation will
be critical to setting up the correspondence between classes of ideals and classes
of forms.

The set of nonzero ideals of $R$ has a commutative associative multiplication
that was introduced in Basic Algebra: if $I$ and $J$ are nonzero ideals, then $IJ$ is
defined to be the set of sums of products from the two ideals, the product $IJ$
again being an ideal. Later in this section we shall recall some properties of this
multiplication that were proved in Basic Algebra.

We define two equivalence relations on the set of nonzero ideals of $R$. We say
that $I$ and $J$ are equivalent if there exist nonzero $r$ and $s$ in $R$ with $(r)I = (s)J$.
Here $(r)$ and $(s)$ are understood to be principal ideals. The ideals $I$ and $J$ are
strictly equivalent, or narrowly equivalent, if equivalence occurs and if $r$ and
$s$ can be chosen with $N(rs^{-1}) > 0$. Both relations are certainly reflexive and
symmetric. To see transitivity, let $(r_1)I_1 = (r_2)I_2$ and $(s_2)I_2 = (s_3)I_3$. Then
$(r_1s_2)I_1 = (r_2s_2)I_2 = (r_2s_3)I_3$, and $I_1$ is equivalent to $I_3$. If also $N(r_1r_2^{-1}) > 0$
and $N(s_2s_3^{-1}) > 0$, then the product $N((r_1s_2)(r_2s_3)^{-1})$ is positive, and $I_1$ is
strictly equivalent to $I_3$. In other words, “equivalent” and “strictly equivalent”
are equivalence relations.

The principal ideals form one full equivalence class under “equivalent.” First
of all, $(r)$ is equivalent to $(s)$ because $(s)(r) = (rs) = (r)(s)$. In the reverse
direction, if $I$ and $(1)$ are equivalent, let $(r)I = (s)$. Then there exists $x \in I$ with
$rx = s$. Hence $sr^{-1}$ is in $I$, and $(sr^{-1}) \subseteq I$. In fact, equality holds: if $y$ is in $I$,
then the equality $ry = sz$ with $z$ in $R$ says that $y = (sr^{-1})z$, and $y$ is in $(sr^{-1})$.
In other words, $I = (sr^{-1})$.

In a sense, therefore, equivalence of ideals measures the extent to which 
nonprincipal ideals exist. 

Multiplication is a class property of ideals relative to equivalence and to
strict equivalence. In fact, if $(r)I = (r')I'$ and $(s)J = (s')J'$, then $(rs)IJ =
(r's')I'J'$, and the assertion follows.

The theorem will be that multiplication of strict equivalence classes of ideals 
of R makes the set of such classes into an abelian group that is isomorphic to the 
finite abelian form class group of discriminant D. This result is not as beautiful as 
one might hope, since the identity class of ideals under strict equivalence need not 
match the set of all principal ideals. However, we can quantify the discrepancy. 
The relevant result is as follows. 


⁹ If $m < 0$, we adopt the convention that $\sqrt{m}$ is positive imaginary.




Proposition 1.17. Equivalence and strict equivalence are the same for ideals
of $R$ if and only if either
(a) $m > 0$ and the fundamental unit $\varepsilon_1$ has $N(\varepsilon_1) = -1$ or
(b) $m < 0$.
In the contrary case when $m > 0$ and the fundamental unit $\varepsilon_1$ has $N(\varepsilon_1) = +1$, a
nonzero principal ideal $(r)$ is strictly equivalent to $(1)$ if and only if $N(r) > 0$;
in particular, the principal ideal $(\sqrt{m})$ is not strictly equivalent to $(1)$.


REMARKS. When $m > 0$, there are examples with $N(\varepsilon_1) = +1$ and examples
with $N(\varepsilon_1) = -1$. Specifically when $m = 2$, $\varepsilon_1 = 1 + \sqrt{2}$, and this has
$N(\varepsilon_1) = -1$. When $m > 0$ and $m$ has any odd prime divisor $p$ with $p \equiv
3 \bmod 4$, then $N(\varepsilon_1) = +1$; in fact, otherwise $\varepsilon_1 = x + y\sqrt{m}$ would imply that
$-1 = N(\varepsilon_1) = x^2 - my^2$ and therefore that $-1 \equiv x^2 \bmod p$, but this congruence
has no solutions by Theorem 1.2a.


PROOF. Suppose that $m > 0$ and $N(\varepsilon_1) = -1$. If $(r)I = (s)J$ with $N(rs^{-1}) <
0$, then $(\varepsilon_1r)I = (s)J$ with $N(\varepsilon_1rs^{-1}) > 0$. Thus equivalence implies strict
equivalence in this case.

Suppose that $m < 0$. Then all norms of nonzero elements are $> 0$. Hence
$N(rs^{-1}) > 0$ is an empty condition, and equivalence implies strict equivalence.

Conversely suppose that $m > 0$ and $N(\varepsilon_1) = +1$. Proposition 1.16 shows that
the most general unit is $\varepsilon = \pm\varepsilon_1^n$, and consequently $N(\varepsilon) = N(\pm 1)N(\varepsilon_1)^n = +1$
for every unit. The element $\sqrt{m}$ is in $R$, and $N(\sqrt{m}) = -m < 0$. We know
that the principal ideals $(1)$ and $(\sqrt{m})$ are equivalent. Arguing by contradiction,
suppose that they are strictly equivalent. Then $(r) = (r)(1) = (s)(\sqrt{m}) =
(s\sqrt{m})$ for some $r$ and $s$ with $N(rs^{-1}) > 0$. Since the principal ideals generated
by $r$ and $s\sqrt{m}$ are the same, these elements must be related by $r = \varepsilon s\sqrt{m}$ for some
unit $\varepsilon$. Then $N(rs^{-1}) = N(\varepsilon\sqrt{m}) = N(\varepsilon)N(\sqrt{m}) = -m < 0$, contradiction.
The proposition follows.


Once we have introduced group structures on the set of equivalence classes of 
ideals and the set of strict equivalence classes of ideals, it follows that the map that 
carries a strict equivalence class to the equivalence class containing it is a group 
homomorphism onto. If either of the conditions (a) and (b) in Proposition 1.17 
is satisfied, then this homomorphism is one-one. Otherwise its kernel consists of 
the two strict equivalence classes of principal ideals — those whose generator has 
positive norm and those whose generator has negative norm. 

At this point we could establish that the set of strict equivalence classes of ideals 
is a finite abelian group. The finiteness of the set of strict equivalence classes 
could be established directly by a geometric argument we give in Chapter V, 
and the group structure could be derived from the group structure on the set of 
“fractional ideals” of K that were introduced in Problems 48-53 at the end of 
Chapter VIII of Basic Algebra. 




Although we could proceed with proofs along these lines, it is instructive to 
proceed in a different way. Rather than give a stand-alone proof of the finiteness 
of the number of strict equivalence classes of ideals, we prefer to derive this 
finiteness as part of the correspondence with proper equivalence classes of binary 
quadratic forms, since the number of such classes of binary quadratic forms has 
already been proved to be finite in Theorem 1.6a. The group structure then readily 
follows from this finiteness and the fact that R is a Dedekind domain. 

Let us pause for a moment, therefore, to use results we already know in order 
to show how the group structure on the set of strict equivalence classes follows 
once it is known that there are only finitely many such classes. We know from 
Theorems 8.54 and 8.55 of Basic Algebra that R is a Dedekind domain and that 
$R$ has unique factorization for its nonzero ideals. In other words, in terms of the
already-defined multiplication of ideals, each nonzero ideal $I$ in $R$ is of the form
$I = \prod_{j=1}^{k} P_j^{n_j}$, where the $P_j$ are distinct nonzero prime ideals, the $n_j$ are positive
integers, and $k$ is $\ge 0$; moreover, this product expansion is unique up to the order
of the factors.


Lemma 1.18. Let $\mathcal{H}$ be the set of strict equivalence classes of nonzero ideals
in $R$, with its inherited commutative associative multiplication. If $\mathcal{H}$ is finite,
then $\mathcal{H}$ is a group under this multiplication.


REMARKS. The group $\mathcal{H}$ will be seen in Theorem 1.20 to be isomorphic to the
form class group of $D$. The set of ordinary equivalence classes is a quotient and
is called the ideal class group of $K$. It will be generalized in Chapter V.


PROOF. The identity element of $\mathcal{H}$ is the strict equivalence class of the ideal
$R = (1)$, and we are to prove the existence of inverses. Thus let $I$ be given. For
the sequence of ideals $I, I^2, I^3, \ldots$, the finiteness of $\mathcal{H}$ shows that two of these
ideals must be strictly equivalent. Suppose that $I^k$ is equivalent to $I^{k+l}$ for some
$k > 0$ and $l > 0$. Then there exist nonzero principal ideals $(r)$ and $(s)$ such that
$(r)I^k = (s)I^{k+l}$. The uniqueness of factorization of ideals implies that we can
cancel $I^k$ from both sides of this equality, thereby obtaining $(r) = (s)I^l$. Let us
define an element $t$ in $R$. If $N(rs^{-1}) > 0$, we take $t$ to be 1. Otherwise $m$ must
be positive, and we let $t = \sqrt{m}$, so that $N(t) < 0$. In both cases we then have
$(rt)(1) = (s)(t)I^l$ with $N(rts^{-1}) > 0$, and the ideal $(t)I^l$ is strictly equivalent
to $(1)$. Hence the strict equivalence class of $(t)I^{l-1}$ is an inverse to the strict
equivalence class of $I$, and $\mathcal{H}$ is a group.


Now we define the mappings $\mathcal{F}$ and $\mathcal{I}$ that we shall use to establish the main
result of this section. Let $I$ be a nonzero ideal in $R$, and suppose that $I$ is given
by an expression $I = \langle r_1, r_2\rangle$ that is positively oriented. We regard $x$ and $y$ as
integer variables. To $I$, we associate the binary quadratic form

\[
\mathcal{F}(I, r_1, r_2) = N(I)^{-1} N(r_1 x + r_2 y) = N(I)^{-1}(r_1 x + r_2 y)(\sigma(r_1)x + \sigma(r_2)y).
\]


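The associated 2-by-2 matrix for this form is

\[
\frac{1}{N(I)}\begin{pmatrix} 2r_1\sigma(r_1) & r_1\sigma(r_2)+r_2\sigma(r_1) \\ r_1\sigma(r_2)+r_2\sigma(r_1) & 2r_2\sigma(r_2)\end{pmatrix}
= \frac{1}{N(I)}\begin{pmatrix} r_1 & \sigma(r_1)\\ r_2 & \sigma(r_2)\end{pmatrix}\begin{pmatrix}\sigma(r_1) & \sigma(r_2)\\ r_1 & r_2\end{pmatrix},
\]

and the discriminant of the quadratic form is therefore

\[
-\det\left[\frac{1}{N(I)}\begin{pmatrix} r_1 & \sigma(r_1)\\ r_2 & \sigma(r_2)\end{pmatrix}\begin{pmatrix}\sigma(r_1) & \sigma(r_2)\\ r_1 & r_2\end{pmatrix}\right]
= N(I)^{-2}\bigl(r_1\sigma(r_2)-\sigma(r_1)r_2\bigr)^2
= |D|\,\frac{\bigl(r_1\sigma(r_2)-\sigma(r_1)r_2\bigr)^2}{\bigl|r_1\sigma(r_2)-\sigma(r_1)r_2\bigr|^2}
= |D|(\operatorname{sgn} m) = D.
\]

Thus we have associated a quadratic form $\mathcal{F}(I, r_1, r_2)$ of discriminant $D$ to an
ideal $I$ when $I$ is given by a positively oriented expression $I = \langle r_1, r_2\rangle$. If
$m < 0$, this quadratic form is positive definite because the coefficient of $x^2$,
namely $N(I)^{-1}r_1\sigma(r_1) = N(I)^{-1}N(r_1)$, is positive when $m < 0$.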
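
As a concrete check of this construction, the following Python sketch computes the form attached to a positively oriented expression from the coordinates of $r_1$ and $r_2$ relative to the $\mathbb{Z}$ basis $\{1, \delta\}$ of $R$. It is an illustration under stated assumptions, not part of the text: the function names are hypothetical, $\operatorname{Tr}(\delta)$ and $N(\delta)$ are taken to be $1$ and $\frac14(1-D)$ when $D \equiv 1 \bmod 4$ and to be $0$ and $-\frac14 D$ when $D \equiv 0 \bmod 4$, and $N(I)$ is computed as $|a_1b_2 - a_2b_1|$, which is what $N(I) = |\sqrt{D}|^{-1}|r_1\sigma(r_2) - \sigma(r_1)r_2|$ reduces to when $\sigma(\delta) - \delta = \sqrt{D}$.

```python
from fractions import Fraction


def delta_trace_norm(D):
    """Return (Tr(delta), N(delta)) for the assumed basis element delta."""
    if D % 4 == 1:
        return 1, (1 - D) // 4
    if D % 4 == 0:
        return 0, -D // 4
    raise ValueError("D must be congruent to 0 or 1 modulo 4")


def norm(u, v, t, n):
    """Norm of u + v*delta, namely u^2 + u*v*Tr(delta) + v^2*N(delta)."""
    return u * u + u * v * t + v * v * n


def form_from_basis(D, r1, r2):
    """Coefficients (A, B, C) of N(I)^{-1} N(r1*x + r2*y).

    r1 and r2 are coordinate pairs (u, v) standing for u + v*delta.
    Assumption: the expression is positively oriented exactly when
    a1*b2 - a2*b1 > 0, and that integer is then N(I).
    """
    t, n = delta_trace_norm(D)
    (a1, b1), (a2, b2) = r1, r2
    index = a1 * b2 - a2 * b1
    if index <= 0:
        raise ValueError("expression is not positively oriented")
    # Tr(r1 * sigma(r2)) is the xy coefficient of N(r1*x + r2*y).
    cross = 2 * a1 * a2 + (a1 * b2 + a2 * b1) * t + 2 * b1 * b2 * n
    return (Fraction(norm(a1, b1, t, n), index),
            Fraction(cross, index),
            Fraction(norm(a2, b2, t, n), index))
```

For $D = -20$ the expression $\langle 2, -1 + \delta\rangle$ has coordinates `(2, 0)` and `(-1, 1)`, and `form_from_basis(-20, (2, 0), (-1, 1))` returns the coefficients of $2x^2 - 2xy + 3y^2$, whose discriminant is $-20$. Rational arithmetic is used so that a pair that does not span an ideal shows up as non-integral coefficients rather than being silently rounded.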

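In the reverse direction we associate to an arbitrary form $(a,b,c)$ of discriminant
$D$ an ideal $I = \mathcal{I}(a,b,c)$ given by a positively oriented expression
$\langle r_1, r_2\rangle$. To begin with, if $b$ is an integer with $b \equiv D \bmod 2$, let us define $b'$
to be $\frac12 b$ if $D \equiv 0 \bmod 4$ and to be $\frac12(b-1)$ if $D \equiv 1 \bmod 4$; in other words,
$b' = \frac12(b - \operatorname{Tr}(\delta))$ in both cases. The definition of $\mathcal{I}$ is to be

\[
\mathcal{I}(a,b,c) = \begin{cases} \langle a,\; b'+\delta\rangle & \text{if } a > 0,\\ \langle \delta a,\; \delta(b'+\delta)\rangle & \text{if } a < 0.\end{cases}
\]

The right sides in the above display make sense as ideals if the angular brackets
are replaced by parentheses. To see that the definitions make sense, we thus need
to check that $\langle a, b'+\delta\rangle = (a, b'+\delta)$ for all $a$ and that the orientations are
positive. Lemma 1.19a below shows that $\langle a, b'+\delta\rangle = (a, b'+\delta)$ if it is proved
that $a$ divides $N(b'+\delta)$, and the computation that verifies this divisibility is

\[
\begin{aligned}
N(b'+\delta) &= b'^2 + b'(\delta+\sigma(\delta)) + \delta\sigma(\delta)\\
&= \begin{cases} b'^2 + b' + \frac14(1-m) & \text{if } D \equiv 1 \bmod 4,\\ b'^2 - m & \text{if } D \equiv 0 \bmod 4,\end{cases}\\
&= \begin{cases} \frac14(b-1)^2 + \frac12(b-1) + \frac14(1-D) & \text{if } D \equiv 1 \bmod 4,\\ \frac14 b^2 - \frac14 D & \text{if } D \equiv 0 \bmod 4,\end{cases}\\
&= \tfrac14(b^2 - D) = ac.
\end{aligned}
\]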
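
The identity $N(b'+\delta) = \frac14(b^2 - D) = ac$ can be spot-checked by machine. The sketch below is a hedged illustration: the function names are hypothetical, and $\operatorname{Tr}(\delta)$ and $N(\delta)$ are taken to have the values $1$, $\frac14(1-D)$ and $0$, $-\frac14 D$ that the two cases of the computation above use.

```python
def delta_trace_norm(D):
    """Return (Tr(delta), N(delta)) for the two cases of the text."""
    if D % 4 == 1:
        return 1, (1 - D) // 4
    if D % 4 == 0:
        return 0, -D // 4
    raise ValueError("D must be congruent to 0 or 1 modulo 4")


def norm_b_prime_plus_delta(b, D):
    """N(b' + delta) with b' = (b - Tr(delta)) / 2."""
    t, n = delta_trace_norm(D)
    if (b - t) % 2 != 0:
        raise ValueError("b must have the same parity as D")
    b_prime = (b - t) // 2
    return b_prime * b_prime + b_prime * t + n


def forms_of_discriminant(D, bound):
    """All integer triples (a, b, c) with b^2 - 4ac = D and |a|, |b| <= bound."""
    forms = []
    for a in range(-bound, bound + 1):
        if a == 0:
            continue
        for b in range(-bound, bound + 1):
            numerator = b * b - D
            if numerator % (4 * a) == 0:
                forms.append((a, b, numerator // (4 * a)))
    return forms
```

Running `norm_b_prime_plus_delta(b, D)` over `forms_of_discriminant(D, bound)` and comparing with `a * c` exercises both parities of $D$ and both signs of $a$.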


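From the definitions near the beginning of this section, the orientation of $\langle r_1, r_2\rangle$
is given by the sign of $(\sqrt{m})^{-1}(r_1\sigma(r_2) - \sigma(r_1)r_2)$. Thus

\[
\begin{aligned}
\operatorname{orientation}\langle a,\; b'+\delta\rangle &= \operatorname{sgn}\bigl((\sqrt{m})^{-1}a(\sigma(\delta)-\delta)\bigr) = \operatorname{sgn} a,\\
\operatorname{orientation}\langle \delta a,\; \delta(b'+\delta)\rangle &= \operatorname{sgn}\bigl((\sqrt{m})^{-1}(\delta a\,\sigma(\delta b'+\delta^2) - \sigma(\delta)a\,\delta(b'+\delta))\bigr)\\
&= \operatorname{sgn}\bigl((\sqrt{m})^{-1}N(\delta)a(\sigma(\delta)-\delta)\bigr) = -\operatorname{sgn} a,
\end{aligned}
\]

and the orientations are positive in both cases.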


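Lemma 1.19.

(a) If $a \ne 0$ and $b'$ are integers such that $a$ divides $N(b'+\delta)$ in $\mathbb{Z}$, then
$\langle a, b'+\delta\rangle = (a, b'+\delta)$ in the sense that the free abelian subgroup of $R$ generated
by $a$ and $b'+\delta$ coincides with the ideal generated by $a$ and $b'+\delta$.

(b) If $I$ is any nonzero ideal in $R$, then $I$ is of the form $I = \langle a, r\rangle$ for some
integer $a > 0$ and some $r$ in $R$.


PROOF. For (a), we are to show that $I' = \mathbb{Z}a + \mathbb{Z}(b'+\delta)$ is closed under
multiplication by the generators 1 and $\delta$ of $R$. Closure of $I'$ under multiplication
by 1 is evident, and the formula $\delta a = -b'a + a(b'+\delta)$ shows that $\delta(\mathbb{Z}a) \subseteq I'$.
Addition of $\delta b'$ to the sum of the two formulas $\delta^2 = \delta(\delta+\sigma(\delta)) - \delta\sigma(\delta) =
\delta\operatorname{Tr}(\delta) - N(\delta)$ and $N(b'+\delta) = b'^2 + b'\operatorname{Tr}(\delta) + N(\delta)$ yields

\[
\delta(b'+\delta) = -N(b'+\delta) + (b'+\operatorname{Tr}(\delta))(b'+\delta),
\]

which shows that $\delta(\mathbb{Z}(b'+\delta)) \subseteq I'$ because $N(b'+\delta)$ is by assumption an integer
multiple of $a$.

For (b), we start from any $\mathbb{Z}$ basis $\{r_1, r_2\}$ of $I$, say with $r_1 = a_1 + b_1\delta$ and
$r_2 = a_2 + b_2\delta$, and let $d = \operatorname{GCD}(b_1, b_2)$. Choose integers $n_1$ and $n_2$ with
$n_1b_1 + n_2b_2 = d$. Then $\operatorname{GCD}(n_1, n_2) = 1$, and we can therefore find integers
$k_1$ and $k_2$ with $\det\begin{pmatrix} k_1 & k_2\\ n_1 & n_2\end{pmatrix} = 1$. Consequently
$\begin{pmatrix} s_1\\ s_2\end{pmatrix} = \begin{pmatrix} k_1 & k_2\\ n_1 & n_2\end{pmatrix}\begin{pmatrix} r_1\\ r_2\end{pmatrix}$ is a new $\mathbb{Z}$
basis of $I$ of the form

\[
\begin{aligned}
s_1 &= c_1 + kd\delta,\\
s_2 &= c_2 + d\delta.
\end{aligned}
\]

If we put $a = s_1 - ks_2$ and possibly replace $a$ by its negative, then $\{a, s_2\}$ is a $\mathbb{Z}$
basis of $I$ of the required form.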
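
The proof of (b) is effectively an algorithm, and it can be sketched in Python as follows. This is an illustration only: elements $u + v\delta$ are represented by hypothetical coordinate pairs `(u, v)`, the helper names are not from the text, and the code manipulates the lattice spanned by the two pairs without checking that this lattice is actually an ideal.

```python
def ext_gcd(x, y):
    """Return (g, u, v) with u*x + v*y == g and g >= 0."""
    old_r, r = x, y
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    if old_r < 0:
        old_r, old_u, old_v = -old_r, -old_u, -old_v
    return old_r, old_u, old_v


def integer_first_basis(r1, r2):
    """Turn a Z basis {r1, r2} into one of the form {a, s2} with a > 0 an integer."""
    (a1, b1), (a2, b2) = r1, r2
    d, n1, n2 = ext_gcd(b1, b2)          # n1*b1 + n2*b2 == d
    if d == 0:
        raise ValueError("r1 and r2 do not span a rank-two lattice")
    _, u, v = ext_gcd(n2, n1)            # u*n2 + v*n1 == 1
    k1, k2 = u, -v                       # det [[k1, k2], [n1, n2]] == 1
    s1 = (k1 * a1 + k2 * a2, k1 * b1 + k2 * b2)
    s2 = (n1 * a1 + n2 * a2, d)
    k = s1[1] // d                       # exact, since d divides b1 and b2
    a = s1[0] - k * s2[0]                # s1 - k*s2 has no delta component
    if a == 0:
        raise ValueError("r1 and r2 do not span a rank-two lattice")
    return abs(a), s2
```

For the pairs `(3, 4)` and `(5, 6)` the steps give $d = 2$, $s_1 = (5, 6)$, $s_2 = (2, 2)$, $k = 3$, and $a = -1$, so the function returns `(1, (2, 2))`; the index $|a_1b_2 - a_2b_1| = 2$ is preserved because the change of basis has determinant 1.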


Theorem 1.20. The set $\mathcal{H}$ of strict equivalence classes of nonzero ideals
relative to the field $K = \mathbb{Q}(\sqrt{m})$ is a finite abelian group. Moreover, the mapping
$\mathcal{F}$ that carries a positively oriented expression $I = \langle r_1, r_2\rangle$ for a nonzero ideal
of $R$ to a binary quadratic form depends only on $I$, not the ordered $\mathbb{Z}$ basis, and
descends to an isomorphism of the group $\mathcal{H}$ onto the form class group $H$ for
the discriminant $D$ of the field $K$, i.e., the group of proper equivalence classes of
binary quadratic forms of discriminant $D$, subject to the remark below. Moreover,
the mapping $\mathcal{I}$ with domain all binary quadratic forms whose discriminant equals
the field discriminant of $K$, sending such a form to a positively oriented expression
for a nonzero ideal of $R$, descends to be defined from $H$ to $\mathcal{H}$, and the descended
map is the two-sided inverse of the isomorphism induced by $\mathcal{F}$.


REMARK. If m < 0, H is understood as usual to include only the classes of 
the positive definite forms. 


PROOF. The proof proceeds in six steps. 


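Step 1. We show that the proper equivalence class of the quadratic form
$\mathcal{F}(I, r_1, r_2)$ depends only on the ideal $I$, not the positively oriented expression
$I = \langle r_1, r_2\rangle$ for it. Thus the class of the form can be abbreviated as $\mathcal{F}(I)$.

Suppose that $I = \langle s_1, s_2\rangle$ is another positively oriented expression for $I$. Then
we can write $\begin{pmatrix} r_1\\ r_2\end{pmatrix} = \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}\begin{pmatrix} s_1\\ s_2\end{pmatrix}$ for a matrix $\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}$ in $GL(2, \mathbb{Z})$, and we have
seen that

\[
\begin{pmatrix} r_1 & \sigma(r_1)\\ r_2 & \sigma(r_2)\end{pmatrix} = \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}\begin{pmatrix} s_1 & \sigma(s_1)\\ s_2 & \sigma(s_2)\end{pmatrix} \tag{$*$}
\]

and that

\[
\det\begin{pmatrix} r_1 & \sigma(r_1)\\ r_2 & \sigma(r_2)\end{pmatrix} = \pm\det\begin{pmatrix} s_1 & \sigma(s_1)\\ s_2 & \sigma(s_2)\end{pmatrix},
\]

where $\pm 1$ is the determinant of $\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}$. Since both expressions $I = \langle r_1, r_2\rangle$ and
$I = \langle s_1, s_2\rangle$ are positively oriented, it follows that the sign in the determinant
equation is plus, hence that $\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}$ is in $SL(2, \mathbb{Z})$. Substituting from $(*)$ into the
formula for the matrix associated to the binary quadratic form $\mathcal{F}(I, r_1, r_2)$, we
obtain the matrix

\[
N(I)^{-1}\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}\begin{pmatrix} s_1 & \sigma(s_1)\\ s_2 & \sigma(s_2)\end{pmatrix}\begin{pmatrix}\sigma(s_1) & \sigma(s_2)\\ s_1 & s_2\end{pmatrix}\begin{pmatrix}\alpha & \gamma\\ \beta & \delta\end{pmatrix}. \tag{$**$}
\]

The product of the coefficient $N(I)^{-1}$ and the middle two matrices is the matrix
associated to the quadratic form $\mathcal{F}(I, s_1, s_2)$, and $(**)$ therefore exhibits the two
quadratic forms as properly equivalent.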


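Step 2. We show that the proper equivalence class $\mathcal{F}(I)$ does not change when
we replace $I$ by a strictly equivalent ideal.

Thus let $I = \langle r_1, r_2\rangle$ and $J = \langle s_1, s_2\rangle$ be positively oriented expressions for $I$ and $J$, and
suppose that $(r)$ and $(s)$ are nonzero principal ideals such that $(r)I = (s)J$
and $N(s/r) > 0$. The formula

\[
\det\begin{pmatrix} rr_1 & \sigma(rr_1)\\ rr_2 & \sigma(rr_2)\end{pmatrix} = r\sigma(r)\det\begin{pmatrix} r_1 & \sigma(r_1)\\ r_2 & \sigma(r_2)\end{pmatrix} = N(r)\det\begin{pmatrix} r_1 & \sigma(r_1)\\ r_2 & \sigma(r_2)\end{pmatrix}
\]

shows that the expression $(r)I = \langle rr_1, rr_2\rangle$ is positively oriented if $N(r) > 0$
and is negatively oriented if $N(r) < 0$. Similarly $(s)J = \langle ss_1, ss_2\rangle$ is positively
oriented if $N(s) > 0$ and is negatively oriented if $N(s) < 0$. Since $N(r/s) > 0$,
$N(r)$ and $N(s)$ are both positive or both negative. Possibly replacing $r$ and $s$ by
$r\sqrt{m}$ and $s\sqrt{m}$, we may assume that $N(r)$ and $N(s)$ are both positive. Then the
matrix associated to the quadratic form $\mathcal{F}((r)I, rr_1, rr_2)$ is

\[
\begin{aligned}
&N((r)I)^{-1}\begin{pmatrix} rr_1 & \sigma(rr_1)\\ rr_2 & \sigma(rr_2)\end{pmatrix}\begin{pmatrix}\sigma(rr_1) & \sigma(rr_2)\\ rr_1 & rr_2\end{pmatrix}\\
&\quad= N((r)I)^{-1}\begin{pmatrix} r_1 & \sigma(r_1)\\ r_2 & \sigma(r_2)\end{pmatrix}\begin{pmatrix} r & 0\\ 0 & \sigma(r)\end{pmatrix}\begin{pmatrix}\sigma(r) & 0\\ 0 & r\end{pmatrix}\begin{pmatrix}\sigma(r_1) & \sigma(r_2)\\ r_1 & r_2\end{pmatrix}\\
&\quad= N((r)I)^{-1}N(r)\begin{pmatrix} r_1 & \sigma(r_1)\\ r_2 & \sigma(r_2)\end{pmatrix}\begin{pmatrix}\sigma(r_1) & \sigma(r_2)\\ r_1 & r_2\end{pmatrix}\\
&\quad= N(I)^{-1}|N(r)|^{-1}N(r)\begin{pmatrix} r_1 & \sigma(r_1)\\ r_2 & \sigma(r_2)\end{pmatrix}\begin{pmatrix}\sigma(r_1) & \sigma(r_2)\\ r_1 & r_2\end{pmatrix}\\
&\quad= N(I)^{-1}\begin{pmatrix} r_1 & \sigma(r_1)\\ r_2 & \sigma(r_2)\end{pmatrix}\begin{pmatrix}\sigma(r_1) & \sigma(r_2)\\ r_1 & r_2\end{pmatrix},
\end{aligned}
\]

while the matrix associated to $\mathcal{F}((s)J, ss_1, ss_2)$, by a similar computation, is

\[
N(J)^{-1}\begin{pmatrix} s_1 & \sigma(s_1)\\ s_2 & \sigma(s_2)\end{pmatrix}\begin{pmatrix}\sigma(s_1) & \sigma(s_2)\\ s_1 & s_2\end{pmatrix}.
\]

Since $(r)I = (s)J$, Step 1 shows that $\mathcal{F}((r)I, rr_1, rr_2)$ is properly equivalent to
$\mathcal{F}((s)J, ss_1, ss_2)$. The two matrix computations identify these forms with
$\mathcal{F}(I, r_1, r_2)$ and $\mathcal{F}(J, s_1, s_2)$, which are therefore properly equivalent.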

Step 3. We show that $\mathcal{I}(a, b, c)$ depends only on the proper equivalence class
of the binary quadratic form $(a, b, c)$.

Problem 37 at the end of Chapter VII of Basic Algebra shows that $SL(2, \mathbb{Z})$
is generated by two explicit matrices $\alpha$ and $\beta$, hence by $\alpha\beta$ and
$\alpha^{-1}$. Thus it is enough to handle $\alpha\beta$ and $\alpha^{-1}$.

The operation of $\alpha\beta$ on forms sends $(a,b,c)$ into the translate
$(a, b+2a, *)$. Define $b' = \frac12(b - \operatorname{Tr}(\delta))$ in the same way as when $\mathcal{I}$ was defined.
If $a > 0$, then $\mathcal{I}(a, b, c) = (a, b'+\delta)$, and $\mathcal{I}(a, b+2a, *) = (a, (b+2a)'+\delta) =
(a, b'+a+\delta)$; thus the two image ideals are the same. If $a < 0$, then the respective
images are $(\delta)(a, b'+\delta)$ and $(\delta)(a, b'+a+\delta)$, and again the image ideals are
the same.

To handle $\alpha^{-1}$, we are to show that the ideals $\mathcal{I}(a, b, c)$ and
$\mathcal{I}(c, -b, a)$ are strictly equivalent. We saw just after the definition of $\mathcal{I}$ that
$N(b'+\delta) = ac$. There are four cases to the proof of the strict equivalence
according to the signs of $a$ and $c$. Let us use the symbol $\sim$ to denote “is strictly
equivalent to.”


Suppose that $a > 0$ and $c > 0$, so that $N(b'+\delta) > 0$. Then

\[
\begin{aligned}
\mathcal{I}(a, b, c) = (a, b'+\delta) &\sim (b'+\sigma(\delta))(a, b'+\delta) = (a(b'+\sigma(\delta)), N(b'+\delta))\\
&= (a(b'+\sigma(\delta)), ac) = (a)(b'+\sigma(\delta), c)\\
&\sim (c, b'+\sigma(\delta)) = (c, -b'-\sigma(\delta)) = (c, (-b)'+\delta),
\end{aligned}
\]

the last equality holding because $b' + (-b)' = -\operatorname{Tr}\delta = -\delta - \sigma(\delta)$. The right
side equals $\mathcal{I}(c, -b, a)$, and the strict equivalence is proved in this case.

Suppose that $a < 0$ and $c < 0$, so that $N(b'+\delta) > 0$. Then

\[
\begin{aligned}
\mathcal{I}(a, b, c) = (\delta)(a, b'+\delta) &\sim (b'+\sigma(\delta))(\delta)(a, b'+\delta)\\
&= (\delta)(a(b'+\sigma(\delta)), N(b'+\delta)) = (\delta)(a(b'+\sigma(\delta)), ac)\\
&= (a)(\delta)(b'+\sigma(\delta), c) \sim (\delta)(c, b'+\sigma(\delta))\\
&= (\delta)(c, -b'-\sigma(\delta)) = (\delta)(c, (-b)'+\delta) = \mathcal{I}(c, -b, a),
\end{aligned}
\]

and the strict equivalence is proved in this case.

Suppose that $a > 0$ and $c < 0$, so that $N(b'+\delta) < 0$. Then $N(\delta)N(b'+\delta)$
is positive, and

\[
\begin{aligned}
\mathcal{I}(a, b, c) = (a, b'+\delta) &\sim (\delta)(b'+\sigma(\delta))(a, b'+\delta)\\
&= (\delta)(a(b'+\sigma(\delta)), N(b'+\delta)) = (\delta)(a(b'+\sigma(\delta)), ac)\\
&= (a)(\delta)(b'+\sigma(\delta), c) \sim (\delta)(c, b'+\sigma(\delta)) = (\delta)(c, -b'-\sigma(\delta))\\
&= (\delta)(c, (-b)'+\delta) = \mathcal{I}(c, -b, a),
\end{aligned}
\]

and the strict equivalence is proved in this case.

Suppose that $a < 0$ and $c > 0$, so that $N(b'+\delta) < 0$. Then $N(\delta)^{-1}N(b'+\delta)$
is positive, and

\[
\begin{aligned}
\mathcal{I}(a, b, c) = (\delta)(a, b'+\delta) &\sim (b'+\sigma(\delta))(a, b'+\delta)\\
&= (a(b'+\sigma(\delta)), N(b'+\delta)) = (a(b'+\sigma(\delta)), ac) = (a)(b'+\sigma(\delta), c)\\
&\sim (c, -b'-\sigma(\delta)) = (c, (-b)'+\delta) = \mathcal{I}(c, -b, a),
\end{aligned}
\]

and the strict equivalence is proved in this case.


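Step 4. We show that the mapping of the set $H$ of proper equivalence classes
of forms to itself induced by $\mathcal{F}\mathcal{I}$ is the identity.

Let the given form be $(a, b, c)$. With $b'$ defined to be $\frac12(b - \operatorname{Tr}(\delta))$ as usual,
we have seen that $N(b'+\delta) = ac$. Therefore $a$ divides $N(b'+\delta)$, and Lemma
1.19a shows that $\langle a, b'+\delta\rangle = (a, b'+\delta)$ in the sense that the ideal generated by
$a$ and $b'+\delta$ matches the free abelian group generated by these two elements.

First suppose that $a > 0$. Then $\mathcal{I}(a, b, c) = \langle a, b'+\delta\rangle = (a, b'+\delta)$, and we
know that this expression is positively oriented. Calculation gives

\[
\begin{aligned}
N(I) &= |\sqrt{D}|^{-1}\left|\det\begin{pmatrix} a & a\\ b'+\delta & b'+\sigma(\delta)\end{pmatrix}\right|
= a|\sqrt{D}|^{-1}|\sigma(\delta)-\delta|\\
&= \begin{cases} a\,|\sqrt{m}|/|\sqrt{m}| & \text{if } D \equiv 1 \bmod 4,\\ a\,|2\sqrt{m}|/|2\sqrt{m}| & \text{if } D \equiv 0 \bmod 4,\end{cases}\\
&= a. 
\end{aligned}
\tag{$\dagger$}
\]

Therefore the quadratic form $\mathcal{F}\mathcal{I}(a, b, c)$ is

\[
\begin{aligned}
N(I)^{-1}&(ax + (b'+\delta)y)(ax + (b'+\sigma(\delta))y)\\
&= a^{-1}\bigl(a^2x^2 + a(2b' + (\delta+\sigma(\delta)))xy + N(b'+\delta)y^2\bigr)\\
&= ax^2 + (2b' + \operatorname{Tr}(\delta))xy + cy^2\\
&= ax^2 + bxy + cy^2,
\end{aligned}
\]

and we see that $\mathcal{F}\mathcal{I}(a, b, c) = (a, b, c)$ when $a > 0$.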

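Next suppose that $a < 0$. Then $\mathcal{I}(a, b, c) = \langle \delta a, \delta(b'+\delta)\rangle = (\delta a, \delta(b'+\delta))$,
and we know that this expression is positively oriented. Since $a < 0$ cannot occur
for $m < 0$, $N(\delta)$ is negative. Thus calculation gives

\[
\begin{aligned}
N(I) &= N((\delta)(a, b'+\delta)) = N((\delta)(-a, b'+\delta)) = |N(\delta)|\,N((-a, b'+\delta))\\
&= |N(\delta)|\,|a| = N(\delta)a,
\end{aligned}
\]

the next-to-last equality following from the calculation that gives $(\dagger)$. Therefore
the quadratic form $\mathcal{F}\mathcal{I}(a, b, c)$ is

\[
\begin{aligned}
N(I)^{-1}&(a\delta x + (b'+\delta)\delta y)(a\sigma(\delta)x + (b'+\sigma(\delta))\sigma(\delta)y)\\
&= N(I)^{-1}N(\delta)(ax + (b'+\delta)y)(ax + (b'+\sigma(\delta))y)\\
&= a^{-1}\bigl(a^2x^2 + a(2b' + (\delta+\sigma(\delta)))xy + N(b'+\delta)y^2\bigr)\\
&= ax^2 + (2b' + \operatorname{Tr}(\delta))xy + cy^2\\
&= ax^2 + bxy + cy^2,
\end{aligned}
\]

and we see that $\mathcal{F}\mathcal{I}(a, b, c) = (a, b, c)$ when $a < 0$.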


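Step 5. We show that the mapping of the set $\mathcal{H}$ of strict equivalence classes of
ideals to itself induced by $\mathcal{I}\mathcal{F}$ is the identity. In view of Step 4, it follows that $\mathcal{F}$
and $\mathcal{I}$ are both one-one onto. Since Theorem 1.6a shows $H$ to be finite, $\mathcal{H}$ has to
be finite, and Lemma 1.18 shows that the multiplication on $\mathcal{H}$ makes $\mathcal{H}$ into an
abelian group.

Let an ideal $I$ be given, and apply Lemma 1.19b to write $I = \langle \tilde a, r\rangle$ with $\tilde a > 0$
an integer. The expression deciding orientation is $\tilde a\sigma(r) - \sigma(\tilde a)r = \tilde a(\sigma(r) - r)$,
and this is multiplied by $-1$ if $r$ is replaced by $-r$. Possibly changing $r$ to $-r$ in
the expression for $I$, we may therefore assume that the expression $I = \langle \tilde a, r\rangle$ is
positively oriented. Write $r = c + d\delta$. Then

\[
\sigma(r) - r = d(\sigma(\delta) - \delta) = \begin{cases} d\sqrt{m} & \text{if } D \equiv 1 \bmod 4,\\ 2d\sqrt{m} & \text{if } D \equiv 0 \bmod 4,\end{cases}
\;= d\sqrt{D}.
\]

The orientation of $I$ is given by $\tilde a(\sigma(r) - r) = \tilde a d\sqrt{D}$, and we deduce that $d > 0$
and that

\[
N(I) = |\sqrt{D}|^{-1}\,\tilde a\,|\sigma(r) - r| = \tilde a d.
\]

The definition of $\mathcal{F}$ gives $\mathcal{F}(I, \tilde a, r) = N(I)^{-1}N(\tilde a x + ry)$, which is a quadratic
form whose $x^2$ coefficient is $a = N(I)^{-1}\tilde a^2 = d^{-1}\tilde a$ and whose $xy$ coefficient is

\[
b = N(I)^{-1}\tilde a\operatorname{Tr}(r) = d^{-1}\operatorname{Tr}(r) = d^{-1}(2c + d\operatorname{Tr}(\delta)) = 2d^{-1}c + \operatorname{Tr}(\delta).
\]

With $b'$ defined as usual to be $b' = \frac12(b - \operatorname{Tr}(\delta))$, we see that $b' = d^{-1}c$.
Consequently $\mathcal{I}\mathcal{F}(I, \tilde a, r) = \langle a, b'+\delta\rangle = \langle d^{-1}\tilde a, d^{-1}c + \delta\rangle$. The product of
this ideal with $(d)$ is $\langle \tilde a, c + d\delta\rangle = \langle \tilde a, r\rangle = I$, and thus $\mathcal{I}\mathcal{F}(I, \tilde a, r)$ is strictly
equivalent to $I$.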

Step 6. We show that the mapping induced by $\mathcal{I}$ from the set $H$ of proper
equivalence classes of forms to the set $\mathcal{H}$ of strict equivalence classes of ideals
respects the group operations in $H$ and $\mathcal{H}$ and hence is an isomorphism.

Let two proper equivalence classes of forms with discriminant $D$ be given,
and use Theorem 1.12a to choose representatives $(a, b, c)$ and $(\tilde a, b, \tilde c)$ with
$\operatorname{GCD}(a, \tilde a) = 1$. The composition of the forms is well defined and is $(a\tilde a, b, *)$
for a suitable third entry in $\mathbb{Z}$. Let $b'$ be $\frac12(b - \operatorname{Tr}(\delta))$ as usual. We divide matters
into cases according to the signs of $a$ and $\tilde a$.

Suppose that $a > 0$ and $\tilde a > 0$. The definition of $\mathcal{I}$ shows that the ideals
corresponding to the three quadratic forms in question are

\[
\langle a, b'+\delta\rangle, \quad \langle \tilde a, b'+\delta\rangle, \quad\text{and}\quad \langle a\tilde a, b'+\delta\rangle.
\]

The product of the first two ideals is $(a\tilde a, a(b'+\delta), \tilde a(b'+\delta), (b'+\delta)^2)$, and we
are to show that this equals $(a\tilde a, b'+\delta)$. In fact, the inclusion

\[
(a\tilde a, a(b'+\delta), \tilde a(b'+\delta), (b'+\delta)^2) \subseteq (a\tilde a, b'+\delta)
\]

is clear. For the reverse inclusion we use the fact that $\operatorname{GCD}(a, \tilde a) = 1$ to write
$k_1a + k_2\tilde a = 1$ for suitable integers $k_1$ and $k_2$. Then we see that $b'+\delta =
k_1(a(b'+\delta)) + k_2(\tilde a(b'+\delta))$, and the reverse inclusion follows.

Suppose that $a$ and $\tilde a$ are of opposite sign. By symmetry we may assume that
$a > 0$ and $\tilde a < 0$. The three ideals are then

\[
\langle a, b'+\delta\rangle, \quad \langle \tilde a\delta, (b'+\delta)\delta\rangle, \quad\text{and}\quad \langle a\tilde a\delta, (b'+\delta)\delta\rangle,
\]

while the product of the first two ideals is $(a\tilde a\delta, a(b'+\delta)\delta, \tilde a(b'+\delta)\delta, (b'+\delta)^2\delta) =
(\delta)(a\tilde a, a(b'+\delta), \tilde a(b'+\delta), (b'+\delta)^2)$. From the previous paragraph this last
ideal equals $(\delta)(a\tilde a, b'+\delta) = (a\tilde a\delta, (b'+\delta)\delta)$, and we have the required match.

Suppose that $a < 0$ and $\tilde a < 0$. This time the product ideal is given by
$(a\delta, (b'+\delta)\delta)(\tilde a\delta, (b'+\delta)\delta) = (\delta^2)(a\tilde a, a(b'+\delta), \tilde a(b'+\delta), (b'+\delta)^2) =
(\delta^2)(a\tilde a, b'+\delta)$, the second equality following from the computation in the
paragraph for $a$ and $\tilde a$ both positive. The ideal $(\delta^2)(a\tilde a, b'+\delta)$ is strictly equivalent
to $(a\tilde a, b'+\delta)$ because $N(\delta^2) = N(\delta)^2$ is positive. Thus we have the required
match on the level of strict equivalence classes. We conclude that the mapping
of $H$ to $\mathcal{H}$ is a group isomorphism.


8. Primes in the Progressions 4n + 1 and 4n + 3


This section is the first of three sections about Dirichlet’s Theorem on primes in 
arithmetic progressions, whose statement is as follows. 


Theorem 1.21 (Dirichlet’s Theorem). If m and b are relatively prime integers 
with m > 0, then there exist infinitely many primes of the form km + b with k a 
positive integer. 


We begin with the earlier treatment of the arithmetic progressions 4n + 1 and 
4n + 3 by Euler. In 1737 Euler made the stunning discovery of the formula

\[
\sum_{n=1}^{\infty}\frac{1}{n^s} = \prod_{p\ \mathrm{prime}}\frac{1}{1 - p^{-s}},
\]

valid for $s > 1$. Actually, the formula is valid for complex $s$ with $\operatorname{Re} s > 1$, but
Euler had not considered powers $n^s$ with $s$ complex by this time and did not need
them for his purpose. Euler's formula is a consequence of unique factorization
of integers. In fact, the product for $p \le N$ is

\[
\prod_{p \le N}\frac{1}{1 - p^{-s}} = \prod_{p \le N}\Bigl(1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots\Bigr) = \sum_{\substack{n\ \text{with no prime}\\ \text{divisors}\ > N}}\frac{1}{n^s}.
\]

Letting $N \to \infty$, we obtain the desired formula.
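
The convergence of the partial products can be observed numerically. The sketch below is an illustration, not part of Euler's argument: the function names are hypothetical, and the comparison value $\zeta(2) = \pi^2/6$ is the classical evaluation of $\sum 1/n^2$, which the text does not use.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit (limit >= 2)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Cross out p*p, p*p + p, ... in one slice assignment.
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def euler_partial_product(s, limit):
    """Product of 1 / (1 - p^(-s)) over primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product /= 1.0 - p ** (-s)
    return product
```

Since the partial product equals the sum of $1/n^s$ over those $n$ with no prime divisor exceeding the limit, `euler_partial_product(2.0, limit)` increases toward $\pi^2/6$ from below as the limit grows.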

Built into the formula is the result of Euclid's that there are infinitely many
primes, i.e., infinitely many primes in the arithmetic progression $n$. There are
two ways to see this. In both cases one starts from the observation that the sum
$\sum_{n=1}^{\infty} 1/n^s$ is $\ge \int_1^{\infty}(1/x^s)\,dx = 1/(s-1)$, from which it follows that the sum
tends to infinity as $s$ decreases to 1. In one case the argument continues with the
observation that if there were only finitely many primes, then $\prod_{p\ \mathrm{prime}}\frac{1}{1-p^{-s}}$ would
certainly have finite limit as $s$ decreases to 1, and we arrive at a contradiction.
In the other case the argument continues with the observation that the logarithm
of $\frac{1}{1-p^{-s}}$ is comparable in size to $1/p^s$, hence that $\log\sum_{n=1}^{\infty} 1/n^s$ is comparable
to $\sum_{p\ \mathrm{prime}} 1/p^s$. Since $\sum_{n=1}^{\infty} 1/n^s$ tends to infinity, $\sum_{p\ \mathrm{prime}} 1/p^s$ must tend to
infinity, and we conclude that there are infinitely many primes. We shall return
to this observation shortly in order to justify it more rigorously.¹⁰

Euclid’s proof was much simpler: if there were only finitely many primes, 
then the sum of 1 and the product of all the primes would be divisible by none of 
the primes and would give a contradiction. The difficulty with Euclid’s argument 
is that there is no apparent way to adapt it to treat primes of the form 4n + 1. 
Euler’s argument, by contrast, does adapt to treat primes 4n + 1. 

Before continuing, let us make rigorous the notion of comparing sizes of factors
of an infinite product with terms of an infinite series. An infinite product $\prod_{n=1}^{\infty} c_n$
with $c_n \in \mathbb{C}$ and with no factor 0 is said to converge if the sequence of partial
products converges to a finite limit and the limit is not 0. A necessary condition
for convergence is that $c_n$ tend to 1.


Proposition 1.22. If $|a_n| < 1$ for all $n$, then the following conditions are
equivalent:

(a) $\prod_{n=1}^{\infty}(1 + |a_n|)$ converges,
(b) $\sum_{n=1}^{\infty}|a_n|$ converges,
(c) $\prod_{n=1}^{\infty}(1 - |a_n|)$ converges.

In this case, $\prod_{n=1}^{\infty}(1 + a_n)$ converges.


PROOF. Condition (c) is equivalent to

(c′) $\prod_{n=1}^{\infty}(1 - |a_n|)^{-1}$ converges.


For each of (a), (b), and (c’), convergence is equivalent to boundedness above. 


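Since

\[
1 + \sum_{n=1}^{N}|a_n| \le \prod_{n=1}^{N}(1 + |a_n|) \le \prod_{n=1}^{N}(1 - |a_n|)^{-1},
\]

we see that (c′) implies (a) and that (a) implies (b). To see that (b) implies (c′),
we may assume, without loss of generality, that $|a_n| \le \frac12$ for all $n$. Since $|x| \le \frac12$
implies that

\[
\Bigl|\log\frac{1}{1-x}\Bigr| \le |x|\sup_{|t| \le |x| \le \frac12}\Bigl|\frac{d}{dt}\log\frac{1}{1-t}\Bigr| = |x|\sup_{|t| \le |x| \le \frac12}\Bigl(\frac{1}{1-t}\Bigr) \le 2|x|,
\]

we have

\[
\log\prod_{n=1}^{N}(1 - |a_n|)^{-1} = \sum_{n=1}^{N}\log\frac{1}{1 - |a_n|} \le 2\sum_{n=1}^{N}|a_n|.
\]

Thus (b) implies (c′).

¹⁰In fact, this argument is showing that $\sum 1/p$ diverges, which says something more than just
that there are infinitely many primes.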
Now suppose that (a) holds. To prove that $\prod_{n=1}^{\infty}(1 + a_n)$ converges, it is enough
to show that $\prod_{n=M}^{N}(1 + a_n)$ tends to 1 as $M$ and $N$ tend to $\infty$. In the expression

\[
\Bigl|\prod_{n=M}^{N}(1 + a_n) - 1\Bigr|,
\]

we expand out the product, move the absolute values in for each term, and
reassemble the product. The result is the inequality

\[
\Bigl|\prod_{n=M}^{N}(1 + a_n) - 1\Bigr| \le \prod_{n=M}^{N}(1 + |a_n|) - 1.
\]

By (a), the right side tends to 0 as $M$ and $N$ tend to $\infty$. Therefore so does the
left side. This proves the proposition.
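
The chain of inequalities between the partial sums and the two kinds of partial products in this proof can be checked exactly on an example. The following sketch is only an illustration; the function name and the sample sequence $|a_n| = 1/(n+1)^2$ are choices made here, not taken from the text.

```python
from fractions import Fraction


def comparison_chain(abs_values):
    """Return (1 + sum |a_n|, prod (1 + |a_n|), prod (1 - |a_n|)^(-1)).

    abs_values is a finite list of Fractions with 0 <= |a_n| < 1.
    """
    partial_sum = Fraction(1)
    product_plus = Fraction(1)
    product_minus_inverse = Fraction(1)
    for x in abs_values:
        partial_sum += x
        product_plus *= 1 + x
        product_minus_inverse /= 1 - x
    return partial_sum, product_plus, product_minus_inverse
```

Exact rational arithmetic avoids any doubt about rounding when the three quantities are close together, as they are when the $|a_n|$ are small.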


Using this proposition and its proof, we can give a more rigorous justification
for the comparison of $\log\sum_{n=1}^{\infty} n^{-s}$ and $\sum_{p\ \mathrm{prime}} p^{-s}$ in Euler's argument. Anticipating
the notation that Riemann was to use for the function a century later,
we introduce

\[
\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^s},
\]

at the moment just for real $s$ with $s > 1$. (This function subsequently was
named the Riemann zeta function and is defined and analytic for complex $s$
with $\operatorname{Re} s > 1$. We postpone a more serious discussion of $\zeta(s)$ to Proposition
1.24 below.) We begin from the formula

\[
\log\zeta(s) = \sum_{p\ \mathrm{prime}}\log\frac{1}{1 - p^{-s}} = \sum_{p\ \mathrm{prime}}\Bigl(\frac{1}{p^s} + \frac{1}{2p^{2s}} + \frac{1}{3p^{3s}} + \cdots\Bigr).
\]

Let us see that this expression equals

\[
\sum_{p\ \mathrm{prime}}\frac{1}{p^s} + \text{bounded term} \quad\text{as } s \downarrow 1.
\]

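Going over the second displayed line in the proof of Proposition 1.22, which
applied when $|x| \le \frac12$, we have

\[
\begin{aligned}
\Bigl|\log\frac{1}{1-x} - x\Bigr| &\le |x|\sup_{|t| \le |x| \le \frac12}\Bigl|\frac{d}{dt}\Bigl(\log\frac{1}{1-t} - t\Bigr)\Bigr|\\
&= |x|\sup_{|t| \le |x| \le \frac12}\Bigl|\frac{1}{1-t} - 1\Bigr| = |x|\sup_{|t| \le |x| \le \frac12}\Bigl|\frac{t}{1-t}\Bigr| \le 2|x|^2.
\end{aligned}
\]

For $x = p^{-s}$ with $s > 1$, this inequality becomes

\[
\Bigl|\log\frac{1}{1 - p^{-s}} - \frac{1}{p^s}\Bigr| \le 2p^{-2s}.
\]

Consequently

\[
\begin{aligned}
\Bigl|\log\zeta(s) - \sum_{p\ \mathrm{prime}}\frac{1}{p^s}\Bigr| &= \Bigl|\sum_{p\ \mathrm{prime}}\Bigl(\log\frac{1}{1 - p^{-s}} - \frac{1}{p^s}\Bigr)\Bigr|\\
&\le \sum_{p\ \mathrm{prime}}\Bigl|\log\frac{1}{1 - p^{-s}} - \frac{1}{p^s}\Bigr| \le 2\sum_{p\ \mathrm{prime}} p^{-2s}.
\end{aligned}
\]

The right side is $\le 2\sum_{n=1}^{\infty} n^{-2}$ for all $s > 1$, and we arrive at the desired formula

\[
\log\zeta(s) = \sum_{p\ \mathrm{prime}}\frac{1}{p^s} + \text{bounded term} \quad\text{as } s \downarrow 1.
\]

Since we know that $\log\zeta(s)$ increases without bound as $s$ decreases to 1, we
can immediately conclude that there are infinitely many primes in the arithmetic
progression $n$.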
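
The termwise estimate behind the bounded term, namely that $\log\frac{1}{1-x} - x$ is at most $2x^2$ in absolute value for $x = p^{-s}$, is easy to examine numerically. The sketch below is illustrative only; the function name is hypothetical, and `math.log1p` is used simply because it evaluates $\log(1 - x)$ accurately for small $x$.

```python
import math


def log_term_gap(p, s):
    """Return (|log(1/(1 - x)) - x|, 2*x^2) for x = p^(-s)."""
    x = p ** (-s)
    gap = abs(-math.log1p(-x) - x)
    return gap, 2.0 * x * x
```

For every prime $p$ and every $s > 1$ one has $x < \frac12$, so the first entry of the returned pair should never exceed the second; for example `log_term_gap(2, 1.001)` shows the least favorable case among primes, where $x$ is just below $\frac12$.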

With this argument well understood as a prototype, let us modify it to treat 
primes 4k + 1 separately from primes 4k + 3. Euler needed one further key idea 
to succeed. It is tempting to replace the sum over all primes of $p^{-s}$ in the above
argument by

\[
\sum_{\substack{p\ \mathrm{prime},\\ p \equiv 1 \bmod 4}}\frac{1}{p^s} \qquad\text{and}\qquad \sum_{\substack{p\ \mathrm{prime},\\ p \equiv 3 \bmod 4}}\frac{1}{p^s},
\]

trace backward, and see what happens. What happens is that the expansion of
the corresponding product of $(1 - p^{-s})^{-1}$ as a sum does not yield anything very
manageable. For example, with the first of the two sums, we are led to the
logarithm of the series $\sum_{n=1}^{\infty} c(n)n^{-s}$, where $c(n)$ is 1 if $n$ is a product of primes
$4k + 1$ and is 0 otherwise, and we have no direct way of deciding whether this
diverges or converges as $s$ decreases to 1.


Euler's key additional idea was to work with the sum and difference of the
displayed series, rather than the two terms separately, and then to recover the two
displayed series at the end. Let us see what this idea accomplishes. Tracing backward
in the derivation of the formula $\log\zeta(s) = \sum_{p\ \mathrm{prime}} p^{-s} + \text{bounded term}$,
we want to obtain a series $\sum_{p\ \mathrm{prime}} a_p p^{-s}$ from the logarithm of a product
$\prod_p (1 - a_p p^{-s})^{-1}$ and be able to recognize this product as equal to a manageable
series $\sum_{n=1}^{\infty} b_n n^{-s}$. Guided by what happens for $\zeta(s)$, we can hope that $b_n$ will be
readily computable from the $a_p$'s and the unique factorization of $n$. The relevant
identities, which we shall verify below, are as follows:

\[
\begin{aligned}
\sum_{n\ \mathrm{odd}}\frac{1}{n^s} &= \prod_{\substack{p\ \mathrm{prime},\\ p\ \mathrm{odd}}}\frac{1}{1 - p^{-s}},\\
\sum_{n\ \mathrm{odd}}\frac{(-1)^{\frac12(n-1)}}{n^s} &= \Bigl(\prod_{\substack{p\ \mathrm{prime},\\ p = 4k+1}}\frac{1}{1 - p^{-s}}\Bigr)\Bigl(\prod_{\substack{p\ \mathrm{prime},\\ p = 4k+3}}\frac{1}{1 + p^{-s}}\Bigr).
\end{aligned}
\]


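In more detail let us write

\[
\chi_0(n) = \begin{cases} 0 & \text{if } n \equiv 0 \bmod 2,\\ 1 & \text{if } n \equiv 1 \bmod 2,\end{cases}
\qquad
\chi_1(n) = \begin{cases} 0 & \text{if } n \equiv 0 \bmod 2,\\ 1 & \text{if } n \equiv 1 \bmod 4,\\ -1 & \text{if } n \equiv 3 \bmod 4.\end{cases}
\]

With $\chi$ equal to $\chi_0$ or $\chi_1$, we have $\chi(mn) = \chi(m)\chi(n)$ for all $m$ and $n$.
Consequently the two expressions $\sum_{n\ \mathrm{odd}}\frac{1}{n^s}$ and $\sum_{n\ \mathrm{odd}}\frac{(-1)^{\frac12(n-1)}}{n^s}$ are both of
the form

\[
L(s, \chi) = \sum_{n=1}^{\infty}\frac{\chi(n)}{n^s},
\]

the function $\chi$ being $\chi_0$ for the first series and being $\chi_1$ for the second series. As
we shall verify rigorously in the next section, the same argument via unique
factorization that yields Euler's identity $\sum_{n=1}^{\infty} n^{-s} = \prod_{p\ \mathrm{prime}}\frac{1}{1 - p^{-s}}$ gives a
factorization

\[
\sum_{n=1}^{\infty}\frac{\chi(n)}{n^s} = \prod_{p\ \mathrm{prime}}\frac{1}{1 - \chi(p)p^{-s}}
\]

because of the identity $\chi(mn) = \chi(m)\chi(n)$. Going over the argument that
$\log\zeta(s)$ is the sum of $\sum_{p\ \mathrm{prime}} p^{-s}$ and a bounded term, we find that

\[
\log L(s, \chi) = \sum_{p\ \mathrm{prime}}\frac{\chi(p)}{p^s} + g(s, \chi)
\]

with $g(s, \chi)$ bounded as $s \downarrow 1$. The sum and difference for the two choices of
$\chi$ gives

\[
\log\bigl(L(s, \chi_0)L(s, \chi_1)\bigr) = 2\sum_{\substack{p\ \mathrm{prime},\\ p = 4k+1}}\frac{1}{p^s} + \bigl(g(s, \chi_0) + g(s, \chi_1)\bigr)
\]

and

\[
\log\bigl(L(s, \chi_0)L(s, \chi_1)^{-1}\bigr) = 2\sum_{\substack{p\ \mathrm{prime},\\ p = 4k+3}}\frac{1}{p^s} + \bigl(g(s, \chi_0) - g(s, \chi_1)\bigr).
\]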
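
The two functions $\chi_0$ and $\chi_1$, their multiplicativity, and the resulting product formula can be explored with a short Python sketch. It is an illustration under stated assumptions: the function names are hypothetical, the comparison is made only at $s = 2$ where both sides converge quickly, and the closed form $(1 - 2^{-s})\zeta(s)$ with $\zeta(2) = \pi^2/6$ is used for $\chi_0$.

```python
import math


def chi0(n):
    """1 for odd n, 0 for even n."""
    return n % 2


def chi1(n):
    """0 for even n, 1 for n = 1 mod 4, -1 for n = 3 mod 4."""
    r = n % 4
    if r == 1:
        return 1
    if r == 3:
        return -1
    return 0


def is_multiplicative(chi, bound):
    """Check chi(m*n) == chi(m)*chi(n) for 1 <= m, n <= bound."""
    return all(chi(m * n) == chi(m) * chi(n)
               for m in range(1, bound + 1) for n in range(1, bound + 1))


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit (limit >= 2)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def l_partial_sum(chi, s, terms):
    """Sum of chi(n) / n^s for 1 <= n <= terms."""
    return sum(chi(n) / n ** s for n in range(1, terms + 1))


def l_partial_product(chi, s, limit):
    """Product of 1 / (1 - chi(p) p^(-s)) over primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product /= 1.0 - chi(p) * p ** (-s)
    return product
```

Comparing `l_partial_sum(chi1, 2.0, N)` with `l_partial_product(chi1, 2.0, N)` for growing `N` shows the two sides of the factorization approaching a common value; the prime 2 contributes the factor 1 to the product because $\chi(2) = 0$.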


The function $L(s, \chi_0)$ is the product of $\zeta(s)$ and an elementary factor. In fact,
a change of index of summation in the formula defining $\zeta(s)$ gives $2^{-s}\zeta(s) =
\sum_{n\ \mathrm{even}} n^{-s}$. Subtracting this formula from the definition of $\zeta(s)$ gives

\[
L(s, \chi_0) = \sum_{n\ \mathrm{odd}}\frac{1}{n^s} = (1 - 2^{-s})\zeta(s).
\]

Therefore

\[
\lim_{s \downarrow 1} L(s, \chi_0) = +\infty.
\]


Meanwhile, the series $L(s, \chi_1) = \sum_{n\ \mathrm{odd}}\frac{(-1)^{\frac12(n-1)}}{n^s}$ is alternating and converges
for $s > 0$ by the Leibniz test. The convergence is uniform on compact sets, and
the sum $L(s, \chi_1)$ is continuous for $s > 0$. Grouping the terms of this series in
pairs, we see that $L(1, \chi_1)$ is positive.¹¹ Hence we have

\[
0 < \lim_{s \downarrow 1} L(s, \chi_1) < +\infty.
\]
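
The grouping of terms in pairs can be made concrete: each pair $\frac{1}{4k+1} - \frac{1}{4k+3}$ is positive, so the paired partial sums of $L(1, \chi_1)$ increase from their first value $\frac23$. The sketch below is illustrative, with a hypothetical function name; the comparison value $\pi/4$ is the one mentioned in the footnote to this passage.

```python
import math


def paired_partial_sum(num_pairs):
    """Sum of (1/(4k+1) - 1/(4k+3)) for 0 <= k < num_pairs."""
    return sum(1.0 / (4 * k + 1) - 1.0 / (4 * k + 3) for k in range(num_pairs))
```

Because every pair is positive, any single partial sum such as `paired_partial_sum(1)` already gives a positive lower bound for $L(1, \chi_1)$, which is all that the argument needs.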


Putting together the two limit relations for $L(s, \chi_0)$ and $L(s, \chi_1)$ as $s$ decreases
to 1, we see that

\[
\log\bigl(L(s, \chi_0)L(s, \chi_1)\bigr) \qquad\text{and}\qquad \log\bigl(L(s, \chi_0)L(s, \chi_1)^{-1}\bigr)
\]

both tend to $+\infty$ as $s \downarrow 1$. Referring to the values computed above for these
expressions and taking into account that $\sum 1/p$ exceeds $\sum 1/p^s$ when $s > 1$,
we see that

\[
\sum_{\substack{p\ \mathrm{prime},\\ p = 4k+1}}\frac{1}{p} \qquad\text{and}\qquad \sum_{\substack{p\ \mathrm{prime},\\ p = 4k+3}}\frac{1}{p}
\]

are both infinite. Hence there are infinitely many primes $4k + 1$, and there are
infinitely many primes $4k + 3$.

¹¹We can even recognize the value of $L(1, \chi_1)$ as $\pi/4$ from the Taylor series of $\arctan x$, but the
explicit value is not needed in the argument.

The proof of the general case of Dirichlet’s Theorem (Theorem 1.21) will 
proceed in similar fashion. We return to it in Section 10 after a brief but systematic 
investigation of the kinds of series and products that we have encountered in the 
present section. 


9. Dirichlet Series and Euler Products 


A series $\sum_{n=1}^{\infty} a_n n^{-s}$ with $a_n$ and $s$ complex is called a Dirichlet series. The
first result below shows that the region of convergence and the region of absolute
convergence for such a series are each right half-planes in $\mathbb{C}$ unless they are equal
to the empty set or to all of $\mathbb{C}$. These half-planes may not be the same: for
example, $\sum_{n=1}^{\infty}(-1)^n n^{-s}$ is convergent for $\operatorname{Re} s > 0$ and absolutely convergent
for $\operatorname{Re} s > 1$.


Proposition 1.23. Let $\sum_{n=1}^{\infty} a_n n^{-s}$ be a Dirichlet series.

(a) If the series is convergent for $s = s_0$, then it is convergent uniformly on
compact sets for $\operatorname{Re} s > \operatorname{Re} s_0$, and the sum of the series is analytic in this region.

(b) If the series is absolutely convergent for $s = s_0$, then it is uniformly
absolutely convergent for $\operatorname{Re} s \ge \operatorname{Re} s_0$.

(c) If the series is convergent for $s = s_0$, then it is absolutely convergent for
$\operatorname{Re} s > \operatorname{Re} s_0 + 1$.

(d) If the series is convergent at some $s_0$ and sums to 0 in a right half-plane,
then all the coefficients are 0.


REMARK. The proof of (a) will use the summation by parts formula. Namely
if $\{u_n\}$ and $\{v_n\}$ are sequences and if $U_n = \sum_{k=1}^{n} u_k$ for $n \ge 0$, then $1 \le M \le N$
implies

\[
\sum_{n=M}^{N} u_n v_n = \sum_{n=M}^{N-1} U_n(v_n - v_{n+1}) + U_N v_N - U_{M-1} v_M. \tag{$*$}
\]
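
The summation by parts formula is an exact algebraic identity, so it can be verified on arbitrary integer data. The sketch below is an illustration with hypothetical function names; sequences are stored in Python lists with an unused entry at index 0 so that list indices match the subscripts in the formula.

```python
def partial_sums(u):
    """Return U with U[n] = u[1] + ... + u[n] and U[0] = 0; u[0] is ignored."""
    U = [0]
    for value in u[1:]:
        U.append(U[-1] + value)
    return U


def direct_sum(u, v, M, N):
    """Left side: sum of u[n]*v[n] for M <= n <= N."""
    return sum(u[n] * v[n] for n in range(M, N + 1))


def summed_by_parts(u, v, M, N):
    """Right side: sum U[n](v[n] - v[n+1]) for M <= n <= N-1, plus boundary terms."""
    U = partial_sums(u)
    interior = sum(U[n] * (v[n] - v[n + 1]) for n in range(M, N))
    return interior + U[N] * v[N] - U[M - 1] * v[M]
```

With `u = [None, 3, -1, 4, 1, -5, 9]` and `v = [None, 2, 7, 1, 8, 2, 8]`, the two functions agree for every choice of $1 \le M \le N \le 6$, including the degenerate case $M = N$ in which the interior sum is empty.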


PROOF. For (a), we write $a_n n^{-s} = a_n n^{-s_0}\cdot n^{-(s-s_0)} = u_n v_n$ and then apply the
summation by parts formula $(*)$. The given convergence means that the sequence
$\{U_n\}$ is convergent, and certainly $v_n$ tends to 0 uniformly on any proper half-plane
of $\operatorname{Re} s > \operatorname{Re} s_0$. Thus the second and third terms on the right side of $(*)$ tend
to 0 with the required uniformity as $M$ and $N$ tend to $\infty$. For the first term, the
sequence $\{U_n\}$ is bounded, and we shall show that

\[
\sum_{n=1}^{\infty}|v_n - v_{n+1}| = \sum_{n=1}^{\infty}\bigl|n^{-(s-s_0)} - (n+1)^{-(s-s_0)}\bigr|
\]

is convergent uniformly on compact sets for which $\operatorname{Re} s > \operatorname{Re} s_0$. Use of $(*)$ and
the Cauchy criterion will complete the proof of convergence. For $n \le t \le n+1$,
we have

\[
\begin{aligned}
\bigl|n^{-(s-s_0)} - t^{-(s-s_0)}\bigr| &\le \sup_{n \le t \le n+1}\Bigl|\frac{d}{dt}\bigl(n^{-(s-s_0)} - t^{-(s-s_0)}\bigr)\Bigr|\\
&= \sup_{n \le t \le n+1}\frac{|s - s_0|}{|t^{s-s_0+1}|} \le \frac{|s - s_0|}{n^{1+\operatorname{Re}(s-s_0)}}.
\end{aligned}
\]

Thus

\[
|v_n - v_{n+1}| = \bigl|n^{-(s-s_0)} - (n+1)^{-(s-s_0)}\bigr| \le \frac{|s - s_0|}{n^{1+\operatorname{Re}(s-s_0)}},
\]

and $\sum_{n=1}^{\infty}|v_n - v_{n+1}|$ is uniformly convergent on compact sets with $\operatorname{Re} s > \operatorname{Re} s_0$,
by the Weierstrass M-test. It follows that the given Dirichlet series is uniformly
convergent on compact sets for which $\operatorname{Re} s > \operatorname{Re} s_0$. Since each term is analytic
in this region, the sum is analytic.

For (b), we have

\[
\Bigl|\frac{a_n}{n^s}\Bigr| \le \Bigl|\frac{a_n}{n^{s_0}}\Bigr|.
\]

Since the sum of the right side is convergent, the desired uniform convergence
follows from the Weierstrass M-test.

For (c), let $\epsilon > 0$ be given. Then

\[
\frac{a_n}{n^{s_0+1+\epsilon}} = \frac{a_n}{n^{s_0}}\cdot\frac{1}{n^{1+\epsilon}}
\]

with the first factor on the right bounded and the second factor contributing to a
finite sum. Therefore we have absolute convergence at $s_0 + 1 + \epsilon$, and (c) follows
from (b).

For (d), we may assume by (c) that there is absolute convergence at $s_0$. Suppose
that $a_1 = \cdots = a_{N-1} = 0$. By (b), $\sum_{n=N}^{\infty} a_n n^{-s} = 0$ for $\operatorname{Re} s > \operatorname{Re} s_0$. The
series

\[
\sum_{n=N}^{\infty} a_n (n/N)^{-s} \tag{$**$}
\]

is by assumption absolutely convergent at $s_0$, and $\operatorname{Re} s > \operatorname{Re} s_0$ implies

\[
|a_n (n/N)^{-s}| \le |a_n (n/N)^{-s_0}|.
\]

By dominated convergence we can take the limit of $(**)$ term by term as $s \to +\infty$.
The only term that survives is $a_N$. Since $(**)$ has sum 0 for all $s$, we conclude
that $a_N = 0$. This completes the proof.


Proposition 1.24. The Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$, initially
defined and analytic for $\operatorname{Re} s > 1$, extends to be meromorphic for $\operatorname{Re} s > 0$. Its
only pole is at $s = 1$, and the pole is simple.


REMARK. Actually, $\zeta(s)$ extends to be meromorphic in $\mathbb{C}$ with no additional
poles, but we do not need this additional information.


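PROOF. For $\operatorname{Re} s > 1$, we have

\[
\frac{1}{s-1} = \int_1^{\infty} t^{-s}\,dt = \sum_{n=1}^{\infty}\int_n^{n+1} t^{-s}\,dt.
\]

Thus $\operatorname{Re} s > 1$ implies

\[
\zeta(s) = \frac{1}{s-1} + \sum_{n=1}^{\infty}\Bigl(n^{-s} - \int_n^{n+1} t^{-s}\,dt\Bigr) = \frac{1}{s-1} + \sum_{n=1}^{\infty}\int_n^{n+1}(n^{-s} - t^{-s})\,dt.
\]

It is enough to show that the series on the right side converges uniformly on
compact sets for $\operatorname{Re} s > 0$. Thus suppose that $\operatorname{Re} s \ge \sigma > 0$ and $|s| \le C$. The
proof of Proposition 1.23a showed that $|n^{-s} - t^{-s}| \le |s|n^{-(1+\operatorname{Re} s)}$. Hence

\[
\Bigl|\int_n^{n+1}(n^{-s} - t^{-s})\,dt\Bigr| \le \int_n^{n+1}|n^{-s} - t^{-s}|\,dt \le |s|n^{-(1+\operatorname{Re} s)} \le Cn^{-(1+\sigma)}.
\]

Since $\sum_{n=1}^{\infty} n^{-(1+\sigma)} < \infty$, the desired uniform convergence follows from the
Weierstrass M-test.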
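
The formula in this proof can be evaluated numerically for real $s$, including values with $0 < s < 1$ where the original series diverges. The sketch below is an illustration, not part of the proof: the function name is hypothetical, only real $s > 0$ with $s \ne 1$ is handled, and the integral $\int_n^{n+1} t^{-s}\,dt$ is replaced by its closed form $\bigl((n+1)^{1-s} - n^{1-s}\bigr)/(1-s)$.

```python
import math


def zeta_by_continuation(s, terms):
    """Approximate zeta(s) as 1/(s-1) + sum_{n<=terms} (n^(-s) - int_n^{n+1} t^(-s) dt)."""
    if s <= 0 or s == 1:
        raise ValueError("this sketch handles real s > 0 with s != 1")
    total = 1.0 / (s - 1.0)
    for n in range(1, terms + 1):
        # Closed form of the integral of t^(-s) over [n, n+1].
        integral = ((n + 1) ** (1.0 - s) - n ** (1.0 - s)) / (1.0 - s)
        total += n ** (-s) - integral
    return total
```

For $s = 2$ the result agrees with $\pi^2/6$ to high accuracy after a few thousand terms. For $s = \frac12$ the terms decay only like $n^{-3/2}$, so convergence is slow, but the partial sums settle near $-1.46$, consistent with the commonly tabulated value $\zeta(\frac12) \approx -1.4603545$.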


Proposition 1.25. Let $Z(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ be a Dirichlet series with all
$a_n \ge 0$. Suppose that the series is convergent in some half-plane and that the sum
extends to be analytic for $\operatorname{Re} s > 0$. Then the series converges for $\operatorname{Re} s > 0$.


PROOF. By assumption the series converges somewhere, and therefore
$s_0 = \inf\{s \ge 0 \mid \sum_{n=1}^{\infty} a_n n^{-s} \text{ converges}\}$ is a well-defined real number $\ge 0$. Arguing
by contradiction, suppose that $s_0 > 0$. Since $\sum a_n n^{-s}$ converges uniformly on
compact sets for $\operatorname{Re} s > s_0$ by Proposition 1.23a and since the terms of the series
are analytic, we can compute the derivatives of the series term by term. Thus
$$Z^{(N)}(s_0 + 1) = \sum_{n=1}^{\infty} \frac{a_n (-\log n)^N}{n^{s_0+1}}. \tag{$*$}$$
The Taylor series of $Z(s)$ about $s_0 + 1$ is
$$Z(s) = \sum_{N=0}^{\infty} \frac{1}{N!}\,(s - s_0 - 1)^N Z^{(N)}(s_0 + 1)$$




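and is convergent at $s = \tfrac12 s_0$ since $Z(s)$ is analytic in the open disk centered at
$s_0 + 1$ and having radius $s_0 + 1$. Thus
$$Z(\tfrac12 s_0) = \sum_{N=0}^{\infty} \frac{1}{N!}\,(1 + \tfrac12 s_0)^N (-1)^N Z^{(N)}(s_0 + 1),$$
with the series convergent. Substituting from $(*)$, we have
$$Z(\tfrac12 s_0) = \sum_{N=0}^{\infty} \sum_{n=1}^{\infty} \frac{a_n (\log n)^N (1 + \tfrac12 s_0)^N}{N!\, n^{s_0+1}}.$$
This is a series with terms $\ge 0$, and Fubini's Theorem allows us to interchange
the order of summation and obtain
$$Z(\tfrac12 s_0) = \sum_{n=1}^{\infty} \frac{a_n}{n^{s_0+1}} \sum_{N=0}^{\infty} \frac{(\log n)^N (1 + \tfrac12 s_0)^N}{N!} = \sum_{n=1}^{\infty} \frac{a_n}{n^{s_0+1}}\, e^{(\log n)(1 + \frac12 s_0)} = \sum_{n=1}^{\infty} a_n n^{-\frac12 s_0}.$$
In other words, the assumption $s_0 > 0$ led to a point between 0 and $s_0$ (namely $\tfrac12 s_0$)
for which there is convergence. This contradiction proves that $s_0 = 0$. Therefore
$\sum_{n=1}^{\infty} a_n n^{-s}$ converges for $\operatorname{Re} s > 0$.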


We shall now examine special features of Dirichlet series that allow the
series to have product expansions like the one for $\zeta(s)$, namely
$\sum_{n=1}^{\infty} n^{-s} = \prod_{p\ \mathrm{prime}} \frac{1}{1 - p^{-s}}$. Consider a formal product
$$\prod_{p\ \mathrm{prime}} (1 + a_p p^{-s} + \cdots + a_{p^m} p^{-ms} + \cdots).$$
If this product is expanded without regard to convergence, the result is the Dirichlet
series $\sum_{n=1}^{\infty} a_n n^{-s}$, where $a_1 = 1$ and $a_n$ is given by
$$a_n = a_{p_1^{r_1}} \cdots a_{p_k^{r_k}} \quad \text{if } n = p_1^{r_1} \cdots p_k^{r_k}.$$
Suppose that the Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ is in fact absolutely convergent in
some right half-plane. Then every rearrangement is absolutely convergent to the
same sum, and the same conclusion is valid for subseries. If $E$ is a finite set of
primes and if $N(E)$ denotes the set of positive integers requiring only members
of $E$ for their factorization, then we have
$$\prod_{p \in E} (1 + a_p p^{-s} + \cdots + a_{p^m} p^{-ms} + \cdots) = \sum_{n \in N(E)} a_n n^{-s}.$$
Letting $E$ swell to the whole set of primes, we see that the infinite
product has a limit in the half-plane of absolute convergence of the Dirichlet
series, and the limit of the infinite product equals the sum of the series. The sum
of the series is 0 only if one of the factors on the left side is 0. In particular, the
sum of the series cannot be identically 0, by Proposition 1.23d. Thus the limit of
the infinite product can be given by only this one Dirichlet series.

Conversely if an absolutely convergent Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ has the
property that its coefficients are multiplicative, i.e.,
$$a_1 = 1 \quad\text{and}\quad a_{mn} = a_m a_n \ \text{ whenever } \mathrm{GCD}(m, n) = 1,$$
then we can form the above infinite product and recover the given series by
expanding the product and using the formula $a_n = a_{p_1^{r_1}} \cdots a_{p_k^{r_k}}$ when $n = p_1^{r_1} \cdots p_k^{r_k}$.
In this case we say that the Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ has the infinite product
as an Euler product. Many functions in elementary number theory give rise
to multiplicative sequences; an example is $a_n = \varphi(n)$, where $\varphi$ is the Euler $\varphi$
function.

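If the coefficients are strictly multiplicative, i.e., if
$$a_1 = 1 \quad\text{and}\quad a_{mn} = a_m a_n \ \text{ for all } m \text{ and } n,$$
then the $p^{\text{th}}$ factor of the infinite product simplifies to
$$1 + a_p p^{-s} + \cdots + (a_p p^{-s})^m + \cdots = \frac{1}{1 - a_p p^{-s}}.$$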


As a consequence we obtain the following proposition. 


Proposition 1.26. If the coefficients of the Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ are
strictly multiplicative, then the Dirichlet series has an Euler product of the form
$$\sum_{n=1}^{\infty} \frac{a_n}{n^s} = \prod_{p\ \mathrm{prime}} \frac{1}{1 - a_p p^{-s}},$$
valid in its region of absolute convergence.


REMARK. We refer to the kind of Euler product in this proposition as a first- 
degree Euler product. 


This is what happens with $\zeta(s)$, for which all the coefficients are 1, and with
$a_n = \chi_0(n)$ and $a_n = \chi_1(n)$ as in the previous section. Conversely an Euler
product expansion of the form in the proposition forces the coefficients of the 
Dirichlet series to be strictly multiplicative. 
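
A first-degree Euler product can be compared numerically with its Dirichlet series. The sketch below takes, as an illustrative assumption, the strictly multiplicative coefficients given by the nonprincipal character modulo 4 (1 at $n \equiv 1$, $-1$ at $n \equiv 3$, 0 at even $n$) and the point $s = 2$; the helper names are hypothetical.

```python
def chi4(n):
    """Nonprincipal character modulo 4: strictly multiplicative coefficients."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]

def dirichlet_sum(a, s, terms):
    """Partial sum of sum_n a(n) n^{-s}."""
    return sum(a(n) / n ** s for n in range(1, terms + 1))

def euler_product(a, s, prime_limit):
    """Partial product of prod_p 1 / (1 - a(p) p^{-s})."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product *= 1.0 / (1.0 - a(p) / p ** s)
    return product

if __name__ == "__main__":
    print(dirichlet_sum(chi4, 2.0, 100_000))
    print(euler_product(chi4, 2.0, 10_000))
```

Both numbers approximate the same value (Catalan's constant, about 0.9159656), in line with Proposition 1.26 inside the region of absolute convergence.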

A Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ with $|a_n| \le n^c$ for some real $c$ is absolutely
convergent for $\operatorname{Re} s > c + 1$. This fact leads us to a convergence criterion for
first-degree Euler products. 




Proposition 1.27. A first-degree Euler product $\prod_{p} (1 - a_p p^{-s})^{-1}$ with
$|a_p| \le p^c$ for some real $c$ and all primes $p$ defines an absolutely convergent
Dirichlet series for $\operatorname{Re} s > c + 1$ and hence a valid identity
$\sum_{n=1}^{\infty} a_n n^{-s} = \prod_{p} (1 - a_p p^{-s})^{-1}$ in that region.


PROOF. The coefficients $a_n$ are strictly multiplicative, and thus $|a_n| \le n^c$ for
all $n$. The absolute convergence follows.


10. Dirichlet’s Theorem on Primes in Arithmetic Progressions 


In this section we shall prove Dirichlet’s Theorem as stated in Theorem 1.21. 
Recall from Section 8 that the proof of Dirichlet’s Theorem for the progressions 
4n + 1 and 4n + 3 required taking the sum and difference of two expressions, 
working with them, and then passing back to the original expressions. Generalizing
this step involves recognizing this process as Fourier analysis on the 2-element
group $(\mathbb{Z}/4\mathbb{Z})^\times$. This kind of Fourier analysis was discussed in Section VII.4
of Basic Algebra. Let us begin by reviewing what is needed from that section 
of Basic Algebra and then pinpoint the Fourier analysis that was the key to the 
argument in Section 8. 

Let $G$ be a finite abelian group, such as $(\mathbb{Z}/m\mathbb{Z})^\times$. A multiplicative character
of $G$ is a homomorphism of $G$ into the circle group $S^1 \subseteq \mathbb{C}^\times$. The multiplicative
characters of $G$ form a finite abelian group $\widehat{G}$ under pointwise multiplication:
$$(\chi\chi')(g) = \chi(g)\chi'(g).$$


In this setting we recall the statement of the Fourier inversion formula. 


THEOREM 7.17 OF Basic Algebra (Fourier inversion formula). Let $G$ be a
finite abelian group, and introduce an inner product on the complex vector space
$C(G, \mathbb{C})$ of all functions from $G$ to $\mathbb{C}$ by the formula
$$\langle F, F' \rangle = \sum_{g \in G} F(g)\overline{F'(g)},$$
the corresponding norm being $\|F\| = \langle F, F \rangle^{1/2}$. Then the members of $\widehat{G}$ form an
orthogonal basis of $C(G, \mathbb{C})$, each $\chi$ in $\widehat{G}$ satisfying $\|\chi\|^2 = |G|$. Consequently
$|\widehat{G}| = |G|$, and any function $F : G \to \mathbb{C}$ is given by the "sum of its Fourier
series":
$$F(g) = \frac{1}{|G|} \sum_{\chi \in \widehat{G}} \Bigl(\sum_{h \in G} F(h)\overline{\chi(h)}\Bigr) \chi(g).$$




EXAMPLE. With the two-element group $G = \{\pm 1\}$, there are two multiplicative
characters, with $\chi_0(+1) = \chi_0(-1) = 1$, $\chi_1(+1) = 1$, and $\chi_1(-1) = -1$. We
can think of the Fourier-coefficient mapping as carrying any complex-valued
function $F$ on $G$ to the function $\widehat{F}$ on $\widehat{G}$ given by $\widehat{F}(\chi) = \sum_{h \in G} F(h)\overline{\chi(h)}$.
The inversion formula says that $F$ is recovered as $F = \tfrac12\bigl(\widehat{F}(\chi_0)\chi_0 + \widehat{F}(\chi_1)\chi_1\bigr)$.
A basis for the 2-dimensional space of complex-valued functions on $G$ consists
of the two functions $F^+$ and $F^-$, with $F^+$ equal to 1 at $+1$ and 0 at $-1$ and
with $F^-$ equal to 0 at $+1$ and 1 at $-1$. The multiplicative characters are given
by $\chi_0 = F^+ + F^-$ and $\chi_1 = F^+ - F^-$. For these two functions the inversion
formula reads $F^+ = \tfrac12(\chi_0 + \chi_1)$ and $F^- = \tfrac12(\chi_0 - \chi_1)$. In Section 8 the roles of
$F^+$ and $F^-$ are played by functions of $s$, not by scalars, with $F^+$ corresponding
to $\sum_{p \equiv 1 \bmod 4} p^{-s}$ and $F^-$ corresponding to $\sum_{p \equiv 3 \bmod 4} p^{-s}$. We are to consider
the functions of $s$ corresponding to their sum $\chi_0$ and to their difference $\chi_1$. The
results of Section 9 show that these are the series that come from Euler products.
The role of the Fourier inversion formula is to ensure that we can reconstruct
$\sum_{p \equiv 1 \bmod 4} p^{-s}$ and $\sum_{p \equiv 3 \bmod 4} p^{-s}$ from the sum and difference. The general
proof of Dirichlet's Theorem is a direct generalization of this argument for $m = 4$.
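
The two-element example can be checked mechanically. This is a small sketch in which functions on $G = \{\pm 1\}$ are stored as dictionaries; the names `fourier_coefficient` and `invert` are illustrative, and complex conjugation is omitted because both characters are real-valued here.

```python
# Characters of G = {+1, -1}: chi0 is trivial, chi1 is the sign character.
G = [1, -1]
chi0 = {1: 1, -1: 1}
chi1 = {1: 1, -1: -1}
characters = [chi0, chi1]

def fourier_coefficient(F, chi):
    """F_hat(chi) = sum over h in G of F(h) * conj(chi(h)); chi is real here."""
    return sum(F[h] * chi[h] for h in G)

def invert(F):
    """Rebuild F from its Fourier coefficients: (1/|G|) sum_chi F_hat(chi) chi."""
    return {
        g: sum(fourier_coefficient(F, chi) * chi[g] for chi in characters) / len(G)
        for g in G
    }

F_plus = {1: 1, -1: 0}   # indicator of +1
F_minus = {1: 0, -1: 1}  # indicator of -1

if __name__ == "__main__":
    print(invert(F_plus), invert(F_minus))
```

Both Fourier coefficients of `F_plus` equal 1, which is the statement $F^+ = \tfrac12(\chi_0 + \chi_1)$.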


Fix an integer $m > 1$. A Dirichlet character modulo $m$ is a function
$\chi : \mathbb{Z} \to S^1 \cup \{0\}$ such that

(i) $\chi(j) = 0$ if and only if $\mathrm{GCD}(j, m) > 1$,
(ii) $\chi(j)$ depends only on the residue class $j \bmod m$,
(iii) when regarded as a function on the residue classes modulo $m$, $\chi$ is a
multiplicative character of $(\mathbb{Z}/m\mathbb{Z})^\times$.


In particular, a Dirichlet character modulo $m$ determines a multiplicative character
of $(\mathbb{Z}/m\mathbb{Z})^\times$. Conversely each multiplicative character of $(\mathbb{Z}/m\mathbb{Z})^\times$ defines a
unique Dirichlet character modulo $m$ as the lift of the multiplicative character on
the set $\{j \in \mathbb{Z} \mid \mathrm{GCD}(j, m) = 1\}$ and as 0 on the rest of $\mathbb{Z}$. For example the
multiplicative character on $(\mathbb{Z}/4\mathbb{Z})^\times$ that is 1 at 1 mod 4 and is $-1$ at 3 mod 4
lifts to the Dirichlet character that is 1 at integers congruent to 1 modulo 4,
is $-1$ at integers congruent to 3 modulo 4, and is 0 at even integers. It will
often be notationally helpful to use the same symbol for the Dirichlet character
and the multiplicative character of $(\mathbb{Z}/m\mathbb{Z})^\times$. Because of this correspondence,
the number of Dirichlet characters modulo $m$ matches the order of $\widehat{G}$ for
$G = (\mathbb{Z}/m\mathbb{Z})^\times$, which matches the order of $G$ and is $\varphi(m)$, where $\varphi$ is the Euler $\varphi$
function. The principal Dirichlet character modulo $m$, denoted by $\chi_0$, is the one
built from the trivial character of $(\mathbb{Z}/m\mathbb{Z})^\times$:
$$\chi_0(j) = \begin{cases} 1 & \text{if } \mathrm{GCD}(j, m) = 1, \\ 0 & \text{if } \mathrm{GCD}(j, m) > 1. \end{cases}$$
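
To make the definition concrete, here is a sketch that lists all Dirichlet characters modulo $m$ under the extra assumption that $(\mathbb{Z}/m\mathbb{Z})^\times$ is cyclic (for instance $m$ an odd prime). That restriction, the brute-force generator search, and the name `dirichlet_characters` are choices made for the illustration, not part of the text.

```python
import cmath
from math import gcd

def dirichlet_characters(m):
    """All Dirichlet characters modulo m, ASSUMING the unit group is cyclic.
    Index 0 of the returned list is the principal character."""
    units = [a for a in range(1, m + 1) if gcd(a, m) == 1]
    phi = len(units)
    # Brute-force search for a generator of the unit group.
    for g in units:
        powers = [pow(g, j, m) for j in range(phi)]
        if len(set(powers)) == phi:
            break
    else:
        raise ValueError("unit group modulo m is not cyclic")
    index = {r: j for j, r in enumerate(powers)}  # discrete logarithm table

    def make(k):
        def chi(n):
            r = n % m
            if gcd(r, m) != 1:
                return 0  # property (i): zero off the units
            return cmath.exp(2j * cmath.pi * k * index[r] / phi)
        return chi

    return [make(k) for k in range(phi)]

if __name__ == "__main__":
    chars = dirichlet_characters(5)
    print(len(chars), [round(abs(chars[1](n)), 6) for n in range(10)])
```

Each returned function depends only on $n \bmod m$, vanishes exactly when $\mathrm{GCD}(n, m) > 1$, and is multiplicative on the units, matching properties (i) through (iii).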




Each Dirichlet character modulo $m$ is strictly multiplicative, in the sense of
the previous section. We assemble each as the coefficients of a Dirichlet series,
the associated Dirichlet L function, by the definition
$$L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}.$$


Proposition 1.28. Fix $m$, and let $\chi$ be a Dirichlet character modulo $m$.

(a) The Dirichlet series $L(s, \chi)$ is absolutely convergent for $\operatorname{Re} s > 1$ and is
given in that region by a first-degree Euler product
$$L(s, \chi) = \prod_{p\ \mathrm{prime}} \frac{1}{1 - \chi(p) p^{-s}}.$$

(b) If $\chi$ is not principal, then the series for $L(s, \chi)$ is convergent for $\operatorname{Re} s > 0$,
and the sum is analytic for $\operatorname{Re} s > 0$.

(c) For the principal Dirichlet character $\chi_0$ modulo $m$, $L(s, \chi_0)$ extends to be
meromorphic for $\operatorname{Re} s > 0$. Its only pole for $\operatorname{Re} s > 0$ is at $s = 1$, and the pole is
simple. It is given in terms of the Riemann zeta function by
$$L(s, \chi_0) = \zeta(s) \prod_{\substack{p\ \mathrm{prime}, \\ p \text{ dividing } m}} (1 - p^{-s}).$$

PROOF. For (a), the boundedness of $\chi$ implies that the series is absolutely
convergent for $\operatorname{Re} s > 1$. Since $\chi$ is strictly multiplicative, $L(s, \chi)$ has a first-degree
Euler product by Proposition 1.26, and the product is convergent in the
same region.
For (b), let us notice that $\chi \ne \chi_0$ implies the equality
$$\sum_{n=1}^{m} \chi(n + b) = 0 \quad \text{for any } b, \tag{$*$}$$
since the member of $\widehat{(\mathbb{Z}/m\mathbb{Z})^\times}$ that corresponds to $\chi$ is orthogonal to the trivial
character, by the Fourier inversion formula as quoted above from Basic Algebra.
For $s$ real and positive, let us write
$$\frac{\chi(n)}{n^s} = \chi(n) \cdot \frac{1}{n^s} = u_n v_n$$
in the notation of the summation by parts formula that follows the statement of
Proposition 1.23, and let us put $U_n = \sum_{k=1}^{n} u_k$. Equation $(*)$ implies that $\{U_n\}$
is bounded, say with $|U_n| \le C$. Summation by parts then gives
$$\Bigl|\sum_{n=M}^{N} \frac{\chi(n)}{n^s}\Bigr| \le \sum_{n=M}^{N-1} C\Bigl(\frac{1}{n^s} - \frac{1}{(n+1)^s}\Bigr) + \frac{C}{M^s} + \frac{C}{N^s} = \frac{2C}{M^s}.$$




This expression tends to 0 as $M$ and $N$ tend to $\infty$. Therefore the series
$L(s, \chi) = \sum_{n=1}^{\infty} \chi(n)/n^s$ is convergent for $s$ real and positive. By Proposition 1.23a the series
is convergent for $\operatorname{Re} s > 0$, and the sum is analytic in this region.

For (c), let $\operatorname{Re} s > 1$. From the product formula in (a) with $\chi$ set equal to $\chi_0$,
we have
$$L(s, \chi_0) = \prod_{\substack{p\ \mathrm{prime}, \\ p \text{ not dividing } m}} \frac{1}{1 - p^{-s}}.$$
Using the Euler product expansion of $\zeta(s)$, we obtain the displayed formula of (c).
The remaining statements in (c) follow from Proposition 1.24, since the product
over primes $p$ dividing $m$ is a finite product.
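
Part (b) can be illustrated for the nonprincipal character modulo 4, where the partial sums $U_n$ of the character stay bounded and the series at $s = 1$ is the Leibniz series with sum $\pi/4$. The sketch below is only a numerical illustration under that choice of character; the helper names are hypothetical.

```python
import math

def chi4(n):
    """Nonprincipal Dirichlet character modulo 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def partial_L(s, terms):
    """Partial sum of L(s, chi4) = sum_n chi4(n) / n^s for real s > 0."""
    return sum(chi4(n) / n ** s for n in range(1, terms + 1))

def max_partial_character_sum(terms):
    """Largest |U_n| for U_n = chi4(1) + ... + chi4(n), n <= terms."""
    running, largest = 0, 0
    for n in range(1, terms + 1):
        running += chi4(n)
        largest = max(largest, abs(running))
    return largest

if __name__ == "__main__":
    print(partial_L(1.0, 200_001), math.pi / 4)
    print(max_partial_character_sum(10_000))
```

The bound on $|U_n|$ is what drives the summation by parts estimate $2C/M^s$ in the proof.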


By Proposition 1.28b, $L(s, \chi)$ is well defined and finite at $s = 1$ if $\chi$ is not
principal. The main step in the proof of Dirichlet's Theorem is the following
lemma.


Lemma 1.29. $L(1, \chi) \ne 0$ if $\chi$ is not principal.


PROOF. Let $Z(s) = \prod_{\chi} L(s, \chi)$. Exactly one factor of $Z(s)$ has a pole at
$s = 1$, according to Proposition 1.28. If any factor has a zero at $s = 1$, then $Z(s)$
is analytic for $\operatorname{Re} s > 0$. Assuming that $Z(s)$ is indeed analytic, we shall derive
a contradiction.

Being the finite product of absolutely convergent Dirichlet series for $\operatorname{Re} s > 1$,
$Z(s)$ is given by an absolutely convergent Dirichlet series. We shall prove that
the coefficients of this series are $\ge 0$. More precisely we shall prove for $\operatorname{Re} s > 1$
that
$$Z(s) = \prod_{p \text{ with } \mathrm{GCD}(p, m) = 1} \frac{1}{(1 - p^{-f(p)s})^{g(p)}}, \tag{$*$}$$
where $f(p)$ is the order of $p$ in $(\mathbb{Z}/m\mathbb{Z})^\times$ and where $g(p) = \varphi(m)/f(p)$, $\varphi$ being
Euler's $\varphi$ function. The factor $(1 - p^{-f(p)s})^{-1}$ is given by a Dirichlet series with
all coefficients $\ge 0$. Hence so is the $g(p)^{\text{th}}$ power, and so is the product over $p$
of the result. Thus $(*)$ will prove that all coefficients of $Z(s)$ are $\ge 0$.

To prove $(*)$, we write, for $\operatorname{Re} s > 1$,
$$Z(s) = \prod_{\chi} L(s, \chi) = \prod_{\chi} \Bigl(\prod_{p} \frac{1}{1 - \chi(p) p^{-s}}\Bigr) = \prod_{p \text{ with } \mathrm{GCD}(p, m) = 1} \Bigl(\prod_{\chi} \frac{1}{1 - \chi(p) p^{-s}}\Bigr).$$
Fix $p$ not dividing $m$. We shall show that
$$\prod_{\chi} (1 - \chi(p) p^{-s}) = (1 - p^{-fs})^{g}, \tag{$**$}$$



where $f$ is the order of $p$ in $(\mathbb{Z}/m\mathbb{Z})^\times$ and where $g = \varphi(m)/f$; then $(*)$ will
follow.

The function $\chi \mapsto \chi(p)$ is a homomorphism of $\widehat{(\mathbb{Z}/m\mathbb{Z})^\times}$ into the subgroup
$\{e^{2\pi i k/f}\}$ of $S^1$ and is onto some cyclic subgroup $\{e^{2\pi i k/f'}\}$ with $f'$ dividing
$f$. Let us see that $f' = f$. In fact, if $f' < f$, then $p^{f'} \not\equiv 1 \bmod m$, while
$\chi(p^{f'}) = \chi(p)^{f'} = 1$ for all $\chi$; since $\chi(p^{f'}) = \chi(1)$ for all $\chi$, the $\chi$'s cannot
span all functions on $(\mathbb{Z}/m\mathbb{Z})^\times$, in contradiction to the Fourier inversion formula
(Theorem 7.17 of Basic Algebra).

Thus $\chi \mapsto \chi(p)$ is onto $\{e^{2\pi i k/f}\}$. In other words, $\chi(p)$ takes on all $f^{\text{th}}$ roots
of unity as values, and the homomorphism property ensures that each is taken on
the same number of times, namely $g = \varphi(m)/f$ times. If $X$ is an indeterminate,
we then have
$$\prod_{\chi} (1 - \chi(p) X) = \Bigl(\prod_{k=0}^{f-1} (1 - e^{2\pi i k/f} X)\Bigr)^{g} = (1 - X^f)^g.$$
Then $(**)$ follows and so does $(*)$. Hence all the coefficients of the Dirichlet
series of $Z(s)$ are $\ge 0$. We have already observed that this series, as the finite
product of absolutely convergent series for $\operatorname{Re} s > 1$, is absolutely convergent for
$\operatorname{Re} s > 1$. Thus Proposition 1.25 applies and shows that the Dirichlet series of
$Z(s)$ converges for $\operatorname{Re} s > 0$.

Since the coefficients of the series are positive, the convergence is absolute
for $s$ real and positive. By Proposition 1.23b the convergence is absolute for
$\operatorname{Re} s > 0$. Therefore the Euler product expansion $(*)$ is valid for $\operatorname{Re} s > 0$.

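For primes $p$ not dividing $m$ and for real $s > 0$, we have
$$\frac{1}{(1 - p^{-fs})^g} = (1 + p^{-fs} + p^{-2fs} + \cdots)^g \ge 1 + p^{-fgs} + p^{-2fgs} + \cdots = 1 + p^{-\varphi(m)s} + p^{-2\varphi(m)s} + \cdots = \frac{1}{1 - p^{-\varphi(m)s}}.$$
In combination with $(*)$, this inequality gives
$$Z(s) \prod_{p \text{ dividing } m} \frac{1}{1 - p^{-\varphi(m)s}} = \Bigl(\prod_{p \text{ with } \mathrm{GCD}(p,m)=1} \frac{1}{(1 - p^{-fs})^g}\Bigr)\Bigl(\prod_{p \text{ dividing } m} \frac{1}{1 - p^{-\varphi(m)s}}\Bigr) \ge \prod_{p\ \mathrm{prime}} \frac{1}{1 - p^{-\varphi(m)s}} = \sum_{n=1}^{\infty} \frac{1}{n^{\varphi(m)s}}.$$
The sum on the right is $+\infty$ for $s = 1/\varphi(m)$, while the left side is finite for that
$s$. This contradiction completes the proof of the lemma.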
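
The identity $\prod_{\chi}(1 - \chi(p)X) = (1 - X^f)^g$ used in the proof can be spot-checked numerically. The sketch below assumes, purely for convenience, that $(\mathbb{Z}/m\mathbb{Z})^\times$ is cyclic so that the values $\chi(p)$ are easy to list; the function names and the sample value of $X$ are illustrative.

```python
import cmath
from math import gcd

def character_values_at(p, m):
    """Values chi(p) over all characters chi of (Z/mZ)^x, ASSUMING the
    unit group is cyclic (e.g. m an odd prime)."""
    units = [a for a in range(1, m) if gcd(a, m) == 1]
    phi = len(units)
    for gen in units:
        powers = [pow(gen, e, m) for e in range(phi)]
        if len(set(powers)) == phi:
            break
    else:
        raise ValueError("unit group modulo m is not cyclic")
    log_p = powers.index(p % m)  # discrete logarithm of p to the base gen
    return [cmath.exp(2j * cmath.pi * k * log_p / phi) for k in range(phi)]

def multiplicative_order(p, m):
    """Order f of p in (Z/mZ)^x; requires GCD(p, m) = 1."""
    f, x = 1, p % m
    while x != 1:
        x = (x * p) % m
        f += 1
    return f

def both_sides(p, m, X):
    """Return (prod_chi (1 - chi(p) X), (1 - X^f)^g) with g = phi(m)/f."""
    values = character_values_at(p, m)
    lhs = 1
    for v in values:
        lhs *= 1 - v * X
    f = multiplicative_order(p, m)
    g = len(values) // f
    return lhs, (1 - X ** f) ** g

if __name__ == "__main__":
    print(both_sides(2, 7, 0.3))  # here f = 3 and g = 2
```

For $m = 7$ and $p = 2$ the order is $f = 3$, so each cube root of unity occurs $g = 2$ times among the values $\chi(2)$.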




PROOF OF THEOREM 1.21. First we show for each Dirichlet character $\chi$ modulo
$m$ that
$$\log L(s, \chi) = \sum_{p\ \mathrm{prime}} \frac{\chi(p)}{p^s} + g(s, \chi) \tag{$*$}$$
for real numbers $s > 1$, with $g(s, \chi)$ remaining bounded as $s \downarrow 1$. In this
statement we have not yet specified a branch of the logarithm, and we shall
choose it presently. Fix $p$ and define, for $s > 1$, a value of the logarithm of the
$p^{\text{th}}$ factor of the Euler product of $L(s, \chi)$ in Proposition 1.28a by
$$\log\Bigl(\frac{1}{1 - \chi(p)p^{-s}}\Bigr) = \frac{\chi(p)}{p^s} + \frac{1}{2}\frac{\chi(p^2)}{p^{2s}} + \frac{1}{3}\frac{\chi(p^3)}{p^{3s}} + \cdots = \frac{\chi(p)}{p^s} + g(s, p, \chi). \tag{$**$}$$
In Section 8 we obtained the inequality $|\log(1 - x)^{-1} - x| \le 2|x|^2$ for real $x$
with $|x| \le \tfrac12$, but the proof remains valid for complex $x$ with $|x| \le \tfrac12$. Since
$x = \chi(p)p^{-s}$ is complex with $|\chi(p)p^{-s}| \le \tfrac12$, we obtain
$$|g(s, p, \chi)| = \Bigl|\log\Bigl(\frac{1}{1 - \chi(p)p^{-s}}\Bigr) - \chi(p)p^{-s}\Bigr| \le 2|\chi(p)p^{-s}|^2 \le 2p^{-2}.$$
Since $\sum_{p\ \mathrm{prime}} p^{-2} \le \sum_{n=1}^{\infty} n^{-2} < \infty$, the series $\sum_{p} g(s, p, \chi)$ is uniformly
convergent for $s > 1$. Let $g(s, \chi)$ be the continuous function $\sum_{p} g(s, p, \chi)$.
Summing $(**)$ over primes $p$, we obtain
$$\sum_{p} \log\Bigl(\frac{1}{1 - \chi(p)p^{-s}}\Bigr) = \sum_{p} \frac{\chi(p)}{p^s} + g(s, \chi).$$
Because of the validity of the Euler product expansion of $L(s, \chi)$ in Proposition
1.28a, the left side represents a branch of $\log L(s, \chi)$. This proves $(*)$.
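For each $b$ prime to $m$, define a function $F_b$ on the positive integers by
$$F_b(n) = \begin{cases} 1 & \text{if } n \equiv b \bmod m, \\ 0 & \text{otherwise.} \end{cases}$$
The Fourier inversion formula (Theorem 7.17 of Basic Algebra) gives
$$\sum_{\chi} \overline{\chi(b)}\chi(n) = \varphi(m) F_b(n). \tag{$\dagger$}$$
Multiplying $(*)$ by $\overline{\chi(b)}$, summing on $\chi$, and using $(\dagger)$ to handle the term that is
summed over $p$ prime, we obtain
$$\varphi(m) \sum_{\substack{p\ \mathrm{prime}, \\ p \equiv b \bmod m}} p^{-s} = \sum_{\chi} \overline{\chi(b)} \log L(s, \chi) - \sum_{\chi} \overline{\chi(b)}\, g(s, \chi). \tag{$\dagger\dagger$}$$
The term $\sum_{\chi} \overline{\chi(b)}\, g(s, \chi)$ is bounded as $s \downarrow 1$, according to $(*)$. The term
$\overline{\chi_0(b)} \log L(s, \chi_0)$ is unbounded as $s \downarrow 1$, by Proposition 1.28c. For $\chi$ nonprincipal,
the term $\overline{\chi(b)} \log L(s, \chi)$ is bounded as $s \downarrow 1$, by Proposition 1.28b and
Lemma 1.29. Therefore the left side of $(\dagger\dagger)$ is unbounded as $s \downarrow 1$. Hence the
number of primes contributing to the sum is infinite.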
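
As an empirical companion to the theorem, one can count primes in each residue class prime to $m$. The sketch below is only an illustration with a simple sieve; the function names are hypothetical, and a finite count of course proves nothing about infinitude.

```python
from math import gcd

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]

def count_primes_by_residue(limit, m):
    """Number of primes p <= limit in each class b mod m with GCD(b, m) = 1."""
    counts = {b: 0 for b in range(m) if gcd(b, m) == 1}
    for p in primes_up_to(limit):
        r = p % m
        if r in counts:  # primes dividing m fall outside these classes
            counts[r] += 1
    return counts

if __name__ == "__main__":
    print(count_primes_by_residue(100_000, 4))
    print(count_primes_by_residue(100_000, 12))
```

Every class prime to $m$ keeps receiving primes as `limit` grows, and the classes are populated roughly equally, consistent with the role of the factor $\varphi(m)$ in $(\dagger\dagger)$.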




11. Problems 


1. Fix an odd integer $m > 1$. Let $P$ be the set of odd primes $p > 0$ such that
$x^2 \equiv m \bmod p$ is solvable and such that $p$ does not divide $m$. Show that $P$ is
nonempty and that there is a finite set $S$ of arithmetic progressions such that the
members of $P$ are the odd primes $> 0$ that lie in at least one member of $S$.


2. Let $D$ be a nonsquare integer, and let $m$ be an odd integer with $\mathrm{GCD}(D, m) = 1$.
By suitably adapting the proof of Theorem 1.6,

(a) prove that if $m$ is primitively representable by some binary quadratic form
of discriminant $D$, then $x^2 \equiv D \bmod m$ is solvable,

(b) prove that if $x^2 \equiv D \bmod m$ is solvable and $m$ is odd, then $m$ is primitively
representable by some binary quadratic form of discriminant $D$.


3. For a fixed discriminant $D$, let $H$ be the group of proper equivalence classes
of binary quadratic forms of discriminant $D$, and let $H'$ be the set of ordinary
equivalence classes of discriminant $D$. Inclusion of a proper equivalence class
into the ordinary equivalence class that contains it gives a map $f$ of $H$ onto $H'$.
Give an example in which $H'$ can admit no group structure for which $f$ is a group
homomorphism.


4. (a) Show that if $(a, b, c)$ has order 3 in the form class group, then the product
of any two integers of the form $ax^2 + bxy + cy^2$ is again of that form.

(b) Show that $h(-23) = 3$.

(c) Using the general theory, show that the class of $2x^2 + xy + 3y^2$ has order 3.

(d) Find an explicit formula for $(X, Y)$ in terms of $(x_1, y_1)$ and $(x_2, y_2)$ such
that $(2x_1^2 + x_1y_1 + 3y_1^2)(2x_2^2 + x_2y_2 + 3y_2^2) = 2X^2 + XY + 3Y^2$.


5. If two integer forms are improperly equivalent over $\mathbb{Z}$, prove that they are properly
equivalent over $\mathbb{Q}$.


6. Verify for the fundamental discriminant $D = -67$ that $h(D) = 1$. (Educational
note: It is known that the only negative fundamental discriminants
$D$ with $h(D) = 1$ are $-3, -4, -7, -8, -11, -19, -43, -67, -163$. It is
known also that the only other nonsquare $D < 0$ for which $h(D) = 1$ are
$-12, -16, -28, -27$.)


7. This problem carries out the algorithm suggested by Theorem 1.8 to find representatives
of all proper equivalence classes of binary quadratic forms $(a, b, c)$ of
discriminant $316 = 4 \cdot 79$. For each of these, $b$ will be even.

(a) For each even positive $b$ with $b < \sqrt{4 \cdot 79}$, factor $(b^2 - 4 \cdot 79)/4$ as a product
$ac$ in all possible ways such that $a > 0$ and such that both $|a|$ and $|c|$ lie
between $\sqrt{79} - b/2$ and $\sqrt{79} + b/2$, obtaining 16 forms $(a, b, c)$. Expand
the list by adjoining each form $(-a, b, -c)$, so that the expanded list has 32
members.




(b) Arrange the 32 members of the expanded list of (a) into 6 cycles, obtaining 
2 cycles of length 4 and 4 cycles of length 6. 
(c) Conclude that h(4 - 79) = 6. 


8. For discriminant $D = -47$, the class number is $h(-47) = 5$, and the reduced
binary quadratic forms are $(1, 1, 12)$, $(2, 1, 6)$, $(2, -1, 6)$, $(3, 1, 4)$, $(3, -1, 4)$.
Show what the multiplication table is for the proper equivalence classes of these
forms.
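
The lists of reduced forms quoted in Problems 4, 6, and 8 can be reproduced by a direct search. The sketch below assumes the usual reduction convention for positive definite forms, namely $|b| \le a \le c$ with $b \ge 0$ whenever $|b| = a$ or $a = c$; that convention and the name `reduced_forms` are assumptions of the illustration and should be compared with the definition used in the chapter.

```python
from math import gcd

def reduced_forms(D):
    """Primitive reduced positive definite forms (a, b, c) with b^2 - 4ac = D < 0,
    under the ASSUMED convention |b| <= a <= c, and b >= 0 if |b| = a or a = c."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 modulo 4")
    forms = []
    a = 1
    while 3 * a * a <= -D:                  # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):      # -a < b <= a
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:      # keep primitive forms only
                forms.append((a, b, c))
        a += 1
    return forms

if __name__ == "__main__":
    print(reduced_forms(-47))
    print(len(reduced_forms(-23)), len(reduced_forms(-67)))
```

The search does not compute the multiplication table asked for in Problem 8; it only confirms the list of representatives.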


Problems 9-11 concern the Jacobi symbol, which is a generalization of the Legendre
symbol. Let $m$ and $n$ be integers with $n > 0$ odd, and let $n = p_1^{k_1} \cdots p_r^{k_r}$ be the prime
factorization of $n$. The Jacobi symbol $\bigl(\frac{m}{n}\bigr)$ is defined to be 0 if $\mathrm{GCD}(m, n) > 1$ and is
defined to be $\prod_{j=1}^{r} \bigl(\frac{m}{p_j}\bigr)^{k_j}$ if $\mathrm{GCD}(m, n) = 1$, where $\bigl(\frac{m}{p_j}\bigr)$ is a Legendre symbol. The
Jacobi symbol therefore extends the domain of the Legendre symbol, and it depends
only on the residue $m \bmod n$. Even when $\mathrm{GCD}(m, n) = 1$, the Jacobi symbol does
not encode whether $m$ is a square modulo $n$, however, since $\bigl(\frac{-1}{21}\bigr) = +1$ and since the
residue $-1$ is not a square modulo 21.
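
The definition can be evaluated directly. The sketch below computes each Legendre symbol by Euler's criterion and factors $n$ by trial division; it deliberately avoids the reciprocity law of Problem 11, and the function names are illustrative.

```python
from math import gcd

def legendre(m, p):
    """Legendre symbol (m/p) for an odd prime p, via Euler's criterion."""
    r = pow(m % p, (p - 1) // 2, p)  # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r

def jacobi(m, n):
    """Jacobi symbol (m/n) for odd n > 0, computed straight from the definition."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    if gcd(m, n) > 1:
        return 0
    result, p = 1, 3
    while n > 1:
        while n % p == 0:            # p is prime whenever it divides the reduced n
            result *= legendre(m, p)
            n //= p
        p += 2
    return result

if __name__ == "__main__":
    print(jacobi(-1, 21))
    print(any((x * x + 1) % 21 == 0 for x in range(21)))
```

The first line prints 1 and the second prints False, which is the example in the text: the symbol is $+1$ although $-1$ is not a square modulo 21.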


9. Suppose that n and n’ are odd positive integers and that m and m’ are integers. 
Verify that 
@ @) = @@). 
(b) (#) = () =1 if GCD(™m, n) = 1. 


n 


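10. Prove for all odd positive integers $n$ that
(a) $\bigl(\frac{-1}{n}\bigr) = (-1)^{\frac12(n-1)}$,
(b) $\bigl(\frac{2}{n}\bigr) = (-1)^{\frac18(n^2-1)}$.

11. (Quadratic reciprocity) Prove for all odd positive integers $m$ and $n$ satisfying
$\mathrm{GCD}(m, n) = 1$ that $\bigl(\frac{m}{n}\bigr) = (-1)^{[\frac12(m-1)][\frac12(n-1)]}\bigl(\frac{n}{m}\bigr)$.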

Problems 12-13 indicate, without spelling out what the group G is, two uses of 


Dirichlet’s Theorem in the subject of “elliptic curves.” No knowledge of the subject 
of elliptic curves is assumed, however. 


12. Suppose that $G$ is a finite abelian group whose order $|G|$ divides $p + 1$ for all
sufficiently large primes $p$ with $p \equiv 3 \bmod 4$. It is to be shown that $|G|$ divides
4 by means of multiple applications of Dirichlet's Theorem.

(a) Deduce that 8 does not divide $|G|$ by considering the arithmetic progression
$8k + 3$.

(b) Deduce that 3 does not divide $|G|$ by considering the arithmetic progression
$12k + 7$.

(c) Deduce that no odd prime $q > 3$ divides $|G|$ by considering the arithmetic
progression $4qk + 3$.


13. Suppose that $G$ is a finite abelian group whose order $|G|$ divides $p + 1$ for all
sufficiently large primes $p$ with $p \equiv 2 \bmod 3$. It is to be shown that $|G|$ divides
6 by means of multiple applications of Dirichlet's Theorem.

(a) Deduce that 4 does not divide $|G|$ by considering the arithmetic progression
$12k + 5$.

(b) Deduce that 9 does not divide $|G|$ by considering the arithmetic progression
$9k + 2$.

(c) Deduce that no odd prime $q > 3$ divides $|G|$ by considering the arithmetic
progression $3qk + 2$.


Problems 14-19 develop some elementary properties of ideals and their norms in
quadratic number fields. Notation is as in Sections 6-7. In particular, the number
field is $K = \mathbb{Q}(\sqrt{m})$, the ring $R$ of algebraic integers in it has $\mathbb{Z}$ basis $\{1, \delta\}$, and $\sigma$
is the nontrivial automorphism of $K$ fixing $\mathbb{Q}$.


14. Prove that if $I = (a, r)$ is a nonzero ideal in $R$ with $a \in \mathbb{Z}$ and $r \in R$, then $a$
divides $N(s)$ for every $s$ in $I$.


15. Prove that any nonzero ideal $I$ in $R$ can be written as $I = (a, b + g\delta)$ with $a$,
$b$, and $g$ in $\mathbb{Z}$ and with $a > 0$, $0 \le b < a$, and $0 < g \le a$. Prove also that the
$\mathbb{Z}$ basis with these properties is unique, and it has the properties that $g$ divides $a$
and $b$ and that $ag$ divides $N(b + g\delta)$.


16. Let $a$, $b$, and $g$ be integers satisfying $a > 0$, $0 \le b < a$, and $0 < g \le a$
with $g$ dividing $a$ and $b$ and with $ag$ dividing $N(b + g\delta)$. Prove that the ideal
$I = (a, b + g\delta)$ in $R$ has $\{a, b + g\delta\}$ as a $\mathbb{Z}$ basis.


17. Prove that if $I = (a, r)$ is a nonzero ideal in $R$ with $a \in \mathbb{Z}$, $r \in R$, and $r = c + d\delta$
for integers $c$ and $d$, then $N(I) = |ad|$.


18. (a) Prove that if $I$ is a nonzero ideal in $R$, then $N(I)$ is the number of elements
in $R/I$.

(b) Deduce that if $I \subseteq J$ are nonzero ideals in $R$, then $N(J)$ divides $N(I)$, and
$I = J$ if and only if $N(J) = N(I)$.


19. (a) Using the Chinese Remainder Theorem, prove that if $I$ and $J$ are nonzero
ideals in $R$ with $I + J = R$, then $N(IJ) = N(I)N(J)$.

(b) Let $P$ be a nonzero prime ideal in $R$, and let $p > 0$ be the prime number
such that $P \cap \mathbb{Z} = (p)\mathbb{Z}$. Then $R/P$ is a vector space over $\mathbb{Z}/p\mathbb{Z}$, and its
order is of the form $p^f$ for some integer $f > 0$. Show by induction on the
integer $e > 0$ that $R/P^e$ has order $p^{ef}$.

(c) Using unique factorization of ideals, deduce that if $I$ and $J$ are any two
nonzero ideals in $R$, then $N(IJ) = N(I)N(J)$.

(d) Prove that any nonzero ideal $I$ of $R$ has $I\sigma(I) = (N(I))$.


Problems 20-24 concern the splitting of prime ideals when extended to quadratic
number fields. Fix a quadratic number field $\mathbb{Q}(\sqrt{m})$, and let $R$, $D$, $\delta$, and $\sigma$ be as
in Sections 6-7. Let $p > 0$ be a prime in $\mathbb{Z}$. According to Theorem 9.62 of Basic
Algebra, the unique factorization of the ideal $(p)R$ in $R$ is one of the following:
$(p)R = (p)$ is already prime in $R$, $(p)R = P_1P_2$ is the product of two distinct prime
ideals, or $(p)R = P_1^2$ is the square of a prime ideal.

20. Deduce from the formula $N((p)R) = p^2$ that if $P$ is a nontrivial factor in the
unique factorization of the ideal $(p)R$, then $N(P) = p$.


21. This problem concerns the prime $p = 2$.

(a) Use Problem 15 to prove that if $D \equiv 5 \bmod 8$, then $(2)R$ is a prime ideal
in $R$.

(b) Prove that if $D \equiv 1 \bmod 8$, then $(2)R$ factors into the product of two distinct
prime factors as $(2)R = (2, \delta)(2, 1 + \delta)$.

(c) Prove that if $D$ is even and $D/4 \equiv 3 \bmod 4$, then $(2)R = (2, 1 + \delta)^2$ exhibits
$(2)R$ as the square of a prime ideal.

(d) Prove that if $D$ is even and $D/4 \equiv 2 \bmod 4$, then $(2)R = (2, \delta)^2$ exhibits
$(2)R$ as the square of a prime ideal.


22. Let $p$ be an odd prime.

(a) Prove that if $D$ is odd, then $(p)R$ has a nontrivial factorization into prime
ideals if and only if $x^2 + x + \tfrac14(1 - D) \equiv 0 \bmod p$ has a solution, and in
this case a factorization of $(p)R$ is as $(p)R = (p, x + \delta)(p, x + \sigma(\delta))$.

(b) Prove that if $D$ is even, then $(p)R$ has a nontrivial factorization into prime
ideals if and only if $x^2 \equiv (D/4) \bmod p$ has a solution, and in this case a
factorization of $(p)R$ is as $(p)R = (p, x + \delta)(p, x + \sigma(\delta))$.

(c) Deduce from (a) and (b) that $(p)R$ has a nontrivial factorization into prime
ideals if and only if $D$ is a square modulo $p$.


23. Let $p$ be an odd prime such that $D$ is a square modulo $p$, so that Problem 22c
gives a nontrivial factorization of $(p)R$ into prime ideals of the form $(p)R =
(p, x + \delta)(p, x + \sigma(\delta))$ for some integer $x$. Let $I = (p, x + \delta)$.

(a) Prove that if $D$ is odd, then $\sigma(I) = I$ if and only if the integer $x$ is $\tfrac12(p - 1)$.
(b) Prove that if $D$ is even, then $\sigma(I) = I$ if and only if the integer $x$ is 0.


24. Let $p$ be an odd prime such that $D$ is a square modulo $p$, so that Problem 22c
gives a nontrivial factorization of $(p)R$ into prime ideals of the form $(p)R =
(p, x + \delta)(p, x + \sigma(\delta))$ for some integer $x$. Using the previous problem, show
that the two factors on the right are the same ideal if and only if $p$ divides $D$.


Problems 25-29 seek to identify the genus group explicitly for fundamental discriminants
$D$. Let $K = \mathbb{Q}(\sqrt{m})$ be the corresponding quadratic number field, let $R$ be
the ring of algebraic integers in $K$, and let $\sigma$ be the nontrivial automorphism of $K$
fixing $\mathbb{Q}$. Let $E = \{p_1, \ldots, p_{g+1}\}$ with $g \ge 0$ be the set of distinct prime divisors
of $D$. The goal of this set of problems is to prove that the order of the genus group
is $2^g$ and to exhibit ideals in $R$ representing each genus. Recall from Theorem 1.20
that strict equivalence classes of ideals correspond to proper equivalence classes of
binary quadratic forms and therefore that each genus corresponds to a set of proper
equivalence classes of binary quadratic forms.


25. Let the form class group $H$ for discriminant $D$ be isomorphic to a product of cyclic
groups of orders $2^{k_1}, \ldots, 2^{k_r}, q_1^{l_1}, \ldots, q_s^{l_s}$, where $k_1, \ldots, k_r$ and $l_1, \ldots, l_s$ are
positive integers and $q_1, \ldots, q_s$ are odd primes that are not necessarily distinct.
Prove that the genus group has order $2^r$ and is abstractly isomorphic to the
subgroup of $H$ of elements whose order divides 2. (Educational note: Thus a
goal of the present set of problems is to show that $r = g$.)


26. According to Problems 20-24, the nonzero prime ideals of $R$ are of three kinds:
(i) unique distinct ideals $I = (p, b + \delta)$ and $\sigma(I) = (p, b + \sigma(\delta))$ with product
$(p)R$ if $p$ is an odd prime not dividing $D$ such that $x^2 \equiv D \bmod p$
is solvable, or if $p = 2$ and $D \equiv 1 \bmod 8$,
(ii) the ideal $(p)R$ if $p$ is an odd prime not dividing $D$ such that $x^2 \equiv
D \bmod p$ is not solvable, or if $p = 2$ and $D \equiv 5 \bmod 8$,
(iii) a unique ideal $I_p$ with $I_p^2 = (p)R$ if $p$ divides $D$.

For each subset $S \subseteq E$ of the $g + 1$ distinct prime divisors of $D$, define
$I_S = \prod_{p \in S} I_p$.

(a) Using unique factorization of ideals in $R$, show that any nonzero proper ideal
$I$ in $R$ with $\sigma(I) = I$ is of the form $(a)I_S$ for some $a \in \mathbb{Z}$ and some subset
$S \subseteq E$.

(b) By considering norms of ideals, show that $I$ uniquely determines $S$ in (a).


27. (a) The element $x = -1$ of $K$ has $N(x) = 1$ and factors as $x = \sigma(y)y^{-1}$ for
the element $y = \sqrt{m}$ of $K$. For all other elements $x$ of $K$ with norm 1,
verify the formula
$$\frac{1 + x}{1 + \sigma(x)} = \frac{(1 + x)x}{(1 + \sigma(x))x} = \frac{(1 + x)x}{x + x\sigma(x)} = \frac{(1 + x)x}{1 + x} = x,$$
and explain why it shows that $x$ is of the form $\sigma(y)y^{-1}$ for some $y \ne 0$ in $K$.
(Educational note: This result is a special case of Hilbert's Theorem 90,
which is a theorem in the cohomology of groups and appears in Chapter III.
The general theorem says for a finite Galois extension $K/k$ with Galois
group $\Gamma$ that the cohomology $H^1$ of the group $\Gamma$ with coefficients in the
abelian group $K^\times$ is 0.)

(b) Show that the element $y$ in (a) can be taken to be in $R$ and that all such $y$'s
in $R$ are $\mathbb{Z}$ multiples of one of them $y_0$, which is unique up to a factor of $-1$.


28. Let $I$ be a nonzero ideal in $R$ whose class in the ideal class group has order 2,
i.e., an ideal $I$ such that $I^2 = (x)$ for some element $x \in R$.
(a) Show that the element $xN(I)^{-1}$ of $K$ has norm 1.

(b) Show that the corresponding element $y_0$ of $R$ from the previous problem has
the property that $\sigma((y_0)I) = (y_0)I$.

(c) Using either $y_0$ or $y_0\sqrt{m}$ from (b), deduce that for any nonzero ideal $I$ in
$R$ with $I^2$ principal, there is a strictly equivalent ideal $I_S$ for some subset
$S \subseteq E$ of the $g + 1$ prime divisors of $D$. Consequently the order of the genus
group is a power of 2 equal to at most $2^{g+1}$.


29. This problem shows that the number of ideals $I_S$ in the previous problem that
are mutually strictly inequivalent is exactly $2^g$. To get at this fact, the problem
investigates properties of principal ideals $I = (x)$ in $R$ with the properties that
$\sigma(I) = I$ and $N(x) > 0$. Since $\sigma(I) = I$, it must be true that $\sigma(x) = \varepsilon x$ for
some unit $\varepsilon$ in $R$, and then $N(\sigma(x)) = N(x)$ implies that $N(\varepsilon) = +1$. Matters
now split into cases along the lines of the hypotheses of Proposition 1.17.

(a) Under the assumption that $m < 0$ and that $m$ is neither $-1$ nor $-3$, show that
if a principal ideal $I = (x)$ in $R$ has $\sigma(I) = I$, then $x$ is in $\mathbb{Z}$ or in $\mathbb{Z}\sqrt{m}$.

(b) Under the assumption that $m < 0$, show that the only subsets $S$ of $E$ for
which the ideal $I_S$ is principal are $S = \varnothing$ and $S$ equal to the set of all
prime divisors of $m$, i.e., $S$ equal to $E$ for $D$ odd and for $D$ even with
$D/4 \equiv 2 \bmod 4$ and $S$ equal to $E - \{2\}$ for $D$ even with $D/4 \equiv 3 \bmod 4$.

(c) Under the assumption that $m < 0$, Proposition 1.17 says that strict equivalence
for ideals coincides with equivalence. Show how to conclude from
this fact and the results of (a) and (b) that the order of the genus group is $2^g$
when $m < 0$.

(d) Under the assumption that $m > 0$ and that the fundamental unit $\varepsilon_1$ has norm
$-1$, Proposition 1.17 says that strict equivalence for ideals coincides with
equivalence. With $I$, $x$, and $\varepsilon$ as in the statement of the problem, show that
$\varepsilon = \pm\varepsilon_1^{2n}$ for some integer $n$. Deduce that $\sigma(\varepsilon_1^n x) = s\varepsilon_1^n x$ for a suitable
choice of sign $s$, and show as a consequence that $I_S$ is principal for the same
$S$'s as in (b) and that the order of the genus group is $2^g$.

(e) Under the assumption that $m > 0$ and that the fundamental unit $\varepsilon_1$ has norm
$+1$, Proposition 1.17 says that strict equivalence for ideals is distinct from
equivalence; in particular, there are two strict equivalence classes of principal
ideals: those with a generator of positive norm and those with a generator of
negative norm. Let $y_{\varepsilon_1}$ and $y_{-\varepsilon_1}$ be the elements produced by Problem 27 that
satisfy $\varepsilon_1 = \sigma(y_{\varepsilon_1})(y_{\varepsilon_1})^{-1}$ and $-\varepsilon_1 = \sigma(y_{-\varepsilon_1})(y_{-\varepsilon_1})^{-1}$. Prove that exactly
one of $y_{\varepsilon_1}$ and $y_{-\varepsilon_1}$ has positive norm, so that two of the principal ideals $(1)$,
$(y_{\varepsilon_1})$, $(y_{-\varepsilon_1})$, $(\sqrt{m})$ are strictly equivalent to $(1)$, and two are not. Prove that
all four of these principal ideals are of the form $I_S$ and that they are distinct.
By expressing elements arising from Problem 27 for the most general unit in
$R$ in terms of $y_{\varepsilon_1}$ and $\varepsilon_1$, show that no other $I_S$ is a principal ideal. Show as
a consequence that the number of strict equivalence classes of ideals among
the $I_S$'s is $2^g$.




Problems 30-34 show that proper equivalence over Q for two integer forms of
fundamental discriminant D implies proper equivalence over Z/DZ. Consequently
the order of the genus group is at least the number of classes of integer forms of
discriminant D under proper equivalence over Z/DZ. It will follow from the next
set of problems, concerning “genus characters,” that the number of such classes is at
least $2^g$, where g + 1 is the number of distinct prime divisors of D. In combination
with Problem 29, this result shows that the number of genera equals $2^g$. Throughout
this set of problems, let D be a fundamental discriminant.


30. Let $(a_1, b_1, c_1)$ be a binary quadratic form over Z of discriminant D. Using
Lemma 1.10, prove that $(a_1, b_1, c_1)$ is properly equivalent over Z to a form
(a, b, c) of discriminant D such that GCD(a, D) = 1.

31. Suppose that (a, b, c) is a binary quadratic form over Z of discriminant D such
that GCD(a, D) = 1.

(a) Prove that if D is odd, then (a, b, c) is properly equivalent over Z to a form
(a, kD, lD) for some integers k and l.

(b) Prove that if D is even, then (a, b, c) is properly equivalent over Z to a form
(a, 2kD, −a(D/4) + lD) for some integers k and l.

32. Suppose that (a, kD, lD) is a form over Z having odd discriminant D, satisfying
GCD(a, D) = 1, and taking on an integer value r relatively prime to D for some
rational (x, y). Write x and y as fractions with a positive common denominator
as small as possible: x = u/w and y = v/w.

(a) Prove that GCD(w, D) = 1, and conclude that $a \equiv d^2 r \bmod D$ for some
integer d relatively prime to D.

(b) Suppose that (a′, k′D, l′D) is a second form over Z having discriminant D,
satisfying GCD(a′, D) = 1, and taking on the value r at some rational point.
Prove that $a' \equiv a s^2 \bmod D$ for some s relatively prime to D.

(c) Suppose that (a, b, c) and (a′, b′, c′) are forms over Z of the same odd
discriminant with GCD(a, D) = GCD(a′, D) = 1, and suppose that these
forms are properly equivalent over Q. Deduce that (a, b, c) and (a′, b′, c′)
are properly equivalent over Z/DZ in the sense that there exists a matrix
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ in SL(2, Z/DZ) such that substitution of $x = \alpha x' + \beta y'$ and
$y = \gamma x' + \delta y'$ leads from $ax^2 + bxy + cy^2$ modulo D to $a'x'^2 + b'x'y' + c'y'^2$
modulo D.

33. Suppose that (a, 2kD, −a(D/4) + lD) is a form over Z having even discriminant
D, satisfying GCD(a, D) = 1, and taking on an integer value r relatively prime
to D for some rational (x, y). Write x and y as fractions with a positive common
denominator as small as possible: x = u/w and y = v/w.

(a) Prove that GCD(w, D) = 1, and obtain a congruence relating a and r
modulo D.

(b) Suppose that (a′, 2k′D, −a′(D/4) + l′D) is a second form over Z having
discriminant D, satisfying GCD(a′, D) = 1, and taking on the value
r at some rational point. Prove that $\left(\frac{a}{p}\right) = \left(\frac{a'}{p}\right)$ for every odd prime p
dividing D.

(c) In the setting of (b), suppose in addition that $D/4 \equiv 3 \bmod 4$. Prove that
$a \equiv a' \bmod 4$.

(d) In the setting of (b), suppose in addition that $D/4 \equiv 2 \bmod 4$. Prove for
$D/4 \equiv 2 \bmod 8$ that $a' \equiv \pm a \bmod 8$, and prove for $D/4 \equiv 6 \bmod 8$ that
either $a' \equiv a \bmod 8$ or $a' \equiv 3a \bmod 8$.

(e) Suppose that (a, b, c) and (a′, b′, c′) are forms over Z of the same even
discriminant with GCD(a, D) = GCD(a′, D) = 1, and suppose that these
forms are properly equivalent over Q. Deduce that (a, b, c) and (a′, b′, c′)
are properly equivalent over Z/DZ.


34. Why does it follow from Problems 30-33 that the order of the genus group for
discriminant D is at least as large as the number of proper equivalence classes
under SL(2, Z/DZ) of integer forms of discriminant D?


Problems 35-40 introduce “genus characters.” In fact, genus characters are already
implicit in Problems 32 and 33. Throughout this set of problems, let D be a fundamental
discriminant, and suppose that D has exactly g + 1 distinct prime factors.
The content of these problems will be summarized in Problem 40. Call two binary
quadratic forms over Z of discriminant D similar modulo D if they take on the same
residues r modulo D that are relatively prime to D. Proper equivalence over Z via
SL(2, Z) implies proper equivalence modulo D via SL(2, Z/DZ), and this in turn
implies similarity modulo D in the sense that was just defined. Problems 30-31 show
that it is enough to study forms $ax^2$ mod D for D odd, where GCD(a, D) = 1, and
to study forms $a(x^2 - (D/4)y^2)$ for D even, again where GCD(a, D) = 1. Initially
the genus characters are functions of pairs (similarity class, r), where r is a residue
modulo D with GCD(r, D) = 1 such that r is represented by the form modulo D.
The values of these functions are $\left(\frac{r}{p}\right)$ for each odd prime p > 0 dividing D, as well
as the indicated one of the following for p = 2 if D is even:

$$\begin{aligned}
\xi(r) &= \left(\frac{-1}{r}\right) = (-1)^{\frac12(r-1)} && \text{if } D \text{ is even and } D/4 \equiv 3 \bmod 4,\\
\eta(r) &= \left(\frac{2}{r}\right) = (-1)^{\frac18(r^2-1)} && \text{if } D \text{ is even and } D/4 \equiv 2 \bmod 8,\\
\xi(r)\eta(r) &= \left(\frac{-2}{r}\right) = (-1)^{\frac12(r-1)+\frac18(r^2-1)} && \text{if } D \text{ is even and } D/4 \equiv 6 \bmod 8.
\end{aligned}$$

Thus g + 1 expressions have been defined for each ordered pair (similarity class, r).
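As a concrete illustration of these definitions, the following Python sketch evaluates the g + 1 genus characters at a residue r. It is only a numerical aid under stated assumptions: the helper names (`legendre`, `odd_prime_divisors`, `genus_characters`, `coprime_values`) are our own, r is taken as a positive representative with GCD(r, D) = 1, and the odd primes dividing D are found by trial division. The example uses the two reduced forms of discriminant D = −20.

```python
from math import gcd

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, by Euler's criterion."""
    t = pow(a % p, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

def odd_prime_divisors(n):
    """Distinct odd primes dividing n, in increasing order (trial division)."""
    n = abs(n)
    while n % 2 == 0:
        n //= 2
    primes, p = [], 3
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 2
    if n > 1:
        primes.append(n)
    return primes

def sign(exponent):
    """(-1)**exponent as an integer."""
    return 1 if exponent % 2 == 0 else -1

def xi(r):
    return sign((r - 1) // 2)          # (-1/r) for odd r > 0

def eta(r):
    return sign((r * r - 1) // 8)      # (2/r) for odd r > 0

def genus_characters(D, r):
    """Values at r of the genus characters for a fundamental discriminant D.
    Assumes r > 0 and gcd(r, D) = 1, so that r is odd when D is even."""
    values = [legendre(r, p) for p in odd_prime_divisors(D)]
    if D % 2 == 0:
        m = D // 4
        if m % 4 == 3:
            values.append(xi(r))
        elif m % 8 == 2:
            values.append(eta(r))
        else:                          # m % 8 == 6
            values.append(xi(r) * eta(r))
    return values

def coprime_values(form, D, bound=5):
    """Positive values of the form on a small box that are prime to D."""
    a, b, c = form
    values = set()
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            v = a * x * x + b * x * y + c * y * y
            if v > 0 and gcd(v, D) == 1:
                values.add(v)
    return sorted(values)

D = -20
for form in [(1, 0, 5), (2, 2, 3)]:
    vectors = {tuple(genus_characters(D, r)) for r in coprime_values(form, D)}
    print(form, vectors)
# (1, 0, 5) {(1, 1)}
# (2, 2, 3) {(-1, -1)}
```

For each form the character values do not depend on which represented residue r is used, and in both cases the product of the two values is +1. This is only a check on one discriminant, not a substitute for the arguments requested in the problems.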


35. Using Problems 32 and 33, show that the genus characters are independent of the
residue r modulo D with GCD(r, D) = 1 such that r is represented by the form
modulo D. Therefore the residue a in the quadratic form, either $ax^2$ mod D for
D odd or $a(x^2 - (D/4)y^2)$ for D even, can be used as r, and the genus characters
are g + 1 functions defined on the set of similarity classes modulo D.


36. Prove that the genus characters respect the operation of multiplication of proper
equivalence classes of forms over Z.

37. The product of all g + 1 genus characters is 1 in every case. A sketch of the
argument for D odd is as follows: Since $D \equiv 1 \bmod 4$, D has an even number
2t of prime factors 4k + 3. Use of the Jacobi symbol with a odd and p varying
over the (odd) prime divisors of D gives

$$\prod_{p} \left(\frac{a}{p}\right) = \prod_{p=4k+1} \left(\frac{a}{p}\right) \prod_{p=4k+3} \left(\frac{a}{p}\right) = \xi(a)^{2t} \prod_{p=4k+1} \left(\frac{p}{a}\right) \prod_{p=4k+3} \left(\frac{p}{a}\right) = \left(\frac{D}{a}\right),$$

and the right side is +1 by Problem 2a. Using this sketch as a guide, show that
the product of all g + 1 genus characters is 1 for the cases that D is even and
(a) $D/4 \equiv 3 \bmod 4$,
(b) $D/4 \equiv 2 \bmod 8$,
(c) $D/4 \equiv 6 \bmod 8$.

38. If D is even, let $\alpha$ be $\xi$ if $D/4 \equiv 3 \bmod 4$, $\eta$ if $D/4 \equiv 2 \bmod 8$, and $\xi\eta$
if $D/4 \equiv 6 \bmod 8$. Let $p \mapsto s_p$ be any function to {±1} from the set of
distinct prime divisors of D. Using Dirichlet’s Theorem on primes in arithmetic
progressions, prove that there exists a prime q such that $\left(\frac{q}{p}\right) = s_p$ for each odd
prime divisor p of D and $\alpha(q) = s_2$ in case D is even.

39. With $\alpha$ as in the previous problem, let $p \mapsto s_p$ be any function to {±1} from the
set of distinct prime divisors of D such that $\prod_p s_p = +1$, and choose a prime
q as in the previous problem. Prove that q is primitively representable by some
integer binary quadratic form of discriminant D and that the values of the genus
characters on this form are the numbers $s_p$. Conclude that the number of distinct
similarity classes modulo D is at least $2^g$.

40. For the quadratic number field $K = Q(\sqrt{m})$ with discriminant D, suppose that
D has g + 1 distinct prime divisors. Conclude that the following equivalence
classes of binary quadratic forms over Z of discriminant D coincide and that the
number of such classes is $2^g$:
(i) classes relative to proper equivalence over Q, i.e., genera,
(ii) classes relative to proper equivalence over Z/DZ,
(iii) classes relative to similarity modulo D.
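The count $2^g$ can be checked numerically for small negative discriminants. The sketch below is an illustration under explicit assumptions, not part of the problems: it treats only D < 0, lists reduced positive definite forms by the classical inequalities |b| ≤ a ≤ c, and groups them by their genus-character values at the smallest represented value prime to D. All function names are hypothetical helpers introduced for this sketch.

```python
from math import gcd, prod

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, by Euler's criterion."""
    t = pow(a % p, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

def odd_prime_divisors(n):
    n = abs(n)
    while n % 2 == 0:
        n //= 2
    primes, p = [], 3
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 2
    if n > 1:
        primes.append(n)
    return primes

def genus_characters(D, r):
    """Genus character values at r > 0 with gcd(r, D) = 1, D fundamental."""
    sign = lambda e: 1 if e % 2 == 0 else -1
    values = [legendre(r, p) for p in odd_prime_divisors(D)]
    if D % 2 == 0:
        m, x, e = D // 4, sign((r - 1) // 2), sign((r * r - 1) // 8)
        values.append(x if m % 4 == 3 else e if m % 8 == 2 else x * e)
    return values

def reduced_forms(D):
    """Primitive reduced forms (a, b, c) with b^2 - 4ac = D < 0 and a > 0."""
    forms, a = [], 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):            # -a < b <= a
            if (b * b - D) % (4 * a) == 0:
                c = (b * b - D) // (4 * a)
                if c < a or (c == a and b < 0):
                    continue                      # not reduced
                if gcd(gcd(a, b), c) == 1:
                    forms.append((a, b, c))
        a += 1
    return forms

def coprime_value(form, D, bound=10):
    """Smallest positive value of the form on a small box that is prime to D."""
    a, b, c = form
    values = [a * x * x + b * x * y + c * y * y
              for x in range(-bound, bound + 1)
              for y in range(-bound, bound + 1)]
    return min(v for v in values if v > 0 and gcd(v, D) == 1)

def genera(D):
    """Group the reduced forms of discriminant D by genus-character values."""
    classes = {}
    for form in reduced_forms(D):
        key = tuple(genus_characters(D, coprime_value(form, D)))
        classes.setdefault(key, []).append(form)
    return classes

for D in (-20, -23, -84):
    g = len(odd_prime_divisors(D)) + (1 if D % 2 == 0 else 0) - 1
    found = genera(D)
    print(D, len(found), 2 ** g, all(prod(key) == 1 for key in found))
```

For D = −84 the four reduced forms (1, 0, 21), (2, 2, 11), (3, 0, 7), (5, 4, 5) land in four different classes, while for D = −23 the three reduced forms share a single class, in agreement with the count $2^g$.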


CHAPTER II 


Wedderburn-Artin Ring Theory


Abstract. This chapter studies finite-dimensional associative division algebras, as well as other 
finite-dimensional associative algebras and closely related rings. The chapter is in two parts that 
overlap slightly in Section 6. The first part gives the structure theory of the rings in question, and 
the second part aims at understanding limitations imposed by the structure of a division ring. 

Section 1 briefly summarizes the structure theory for finite-dimensional (nonassociative) Lie
algebras that was the primary historical motivation for structure theory in the associative case. All 
the algebras in this chapter except those explicitly called Lie algebras are understood to be associative. 

Section 2 introduces left semisimple rings, defined as rings R with identity such that the left 
R module R is semisimple. Wedderburn’s Theorem says that such a ring is the finite product of 
full matrix rings over division rings. The number of factors, the size of each matrix ring, and the 
isomorphism class of each division ring are uniquely determined. It follows that left semisimple 
and right semisimple are the same. If the ring is a finite-dimensional algebra over a field F, then the
various division rings are finite-dimensional division algebras over F. The factors of semisimple
rings are simple, i.e., are nonzero and have no nontrivial two-sided ideals, but an example is given 
to show that a simple ring need not be semisimple. Every finite-dimensional simple algebra is 
semisimple. 

Section 3 introduces chain conditions into the discussion as a useful generalization of finite 
dimensionality. A ring R with identity is left Artinian if the left ideals of the ring satisfy the 
descending chain condition. Artin’s Theorem for simple rings is that left Artinian is equivalent to 
semisimplicity, hence to the condition that the given ring be a full matrix ring over a division ring. 

Sections 4-6 concern what happens when the assumption of semisimplicity is dropped but some 
finiteness condition is maintained. Section 4 introduces the Wedderburn-Artin radical rad R of a
left Artinian ring R as the sum of all nilpotent left ideals. The radical is a two-sided nilpotent ideal.
It is 0 if and only if the ring is semisimple. More generally R/rad R is always semisimple if R is
left Artinian. Sections 5-6 state and prove Wedderburn’s Main Theorem: that a finite-dimensional
algebra R with identity over a field F of characteristic 0 has a semisimple subalgebra S such that R
is isomorphic as a vector space to S ⊕ rad R. The semisimple algebra S is isomorphic to R/rad R.
Section 5 gives the hard part of the proof, which handles the special case that R/rad R is isomorphic
to a product of full matrix algebras over F. The remainder of the proof, which appears in Section 6, 
follows relatively quickly from the special case in Section 5 and an investigation of circumstances 
under which the tensor product over F of two semisimple algebras is semisimple. Such a tensor 
product is not always semisimple, but it is semisimple in characteristic 0. 

The results about tensor products in Section 6, but with other hypotheses in place of the condition 
of characteristic 0, play a role in the remainder of the chapter, which is aimed at identifying certain 
division rings. Sections 7-8 provide general tools. Section 7 begins with further results about tensor
products. Then the Skolem-Noether Theorem gives a relationship between any two homomorphisms
of a simple subalgebra into a simple algebra whose center coincides with the underlying field of
scalars. Section 8 proves the Double Centralizer Theorem, which says for this situation that the
centralizer of the simple subalgebra in the whole algebra is simple and that the product of the
dimensions of the subalgebra and the centralizer is the dimension of the whole algebra.

Sections 9-10 apply the results of Sections 6-8 to obtain two celebrated theorems — Wedderburn’s 
Theorem about finite division rings and Frobenius’s Theorem classifying the finite-dimensional 
associative division algebras over the reals. 


1. Historical Motivation 


Elementary ring theory came from several sources historically and was already in 
place by 1880. Some of the sources are field theory (studied by Galois and others), 
rings of algebraic integers (studied by Gauss, Dirichlet, Kummer, Kronecker, 
Dedekind, and others), and matrices (studied by Cayley, Hamilton, and others). 
More advanced general ring theory arose initially not on its own but as an effort 
to imitate the theory of “Lie algebras,” which began about 1880. 

A brief summary of some early theorems about Lie algebras will put matters 
in perspective. The term “algebra” in connection with a field F refers at least to 
an F vector space with a multiplication that is F bilinear. This chapter will deal 
only with two kinds of such algebras, the Lie algebras and those algebras whose 
multiplication is associative. If the modifier “Lie” is absent, the understanding is 
that the algebra is associative. 

Lie algebras arose originally from “Lie groups”—which we can regard for 
current purposes as connected groups with finitely many smooth parameters — 
by a process of taking derivatives along curves at the identity element of the 
group. Precise knowledge of that process will be unnecessary in our treatment, 
but we describe one example: The vector space $M_n(R)$ of all n-by-n matrices over
R becomes a Lie algebra with multiplication defined by the “bracket product”
[X, Y] = XY − YX. If G is a closed subgroup of the matrix group GL(n, R)
and g is the set of all members of $M_n(R)$ of the form X = c′(0), where c is a
smooth curve in G with c(0) equal to the identity, then it turns out that the vector
space g is closed under the bracket product and is a Lie algebra. Although one 
might expect the Lie algebra g to give information about the Lie group G only 
infinitesimally at the identity, it turns out that g determines the multiplication rule 
for G in a whole open neighborhood of the identity. Thus the Lie group and Lie 
algebra are much more closely related than one might at first expect. 


We turn to the underlying definitions and early main theorems about Lie alge- 
bras. Let F be a field. A vector space A over F with an F bilinear multiplication 
$(X, Y) \mapsto [X, Y]$ is a Lie algebra if the multiplication has the two properties

(i) [X, X] = 0 for all $X \in A$,
(ii) (Jacobi identity) [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0 for all
$X, Y, Z \in A$.

Multiplication is often referred to as bracket. It is usually not associative. The
vector space $M_n(F)$ with [X, Y] = XY − YX is a Lie algebra, as one easily
checks by expanding out the various brackets that are involved; it is denoted by
gl(n, F).

The elementary structural definitions with Lie algebras run parallel to those 
with rings. A Lie subalgebra S of A is a vector subspace closed under brackets, 
an ideal I of A is a vector subspace such that [X, Y] is in I for $X \in I$ and $Y \in A$,
a homomorphism $\varphi : A_1 \to A_2$ of Lie algebras is a linear mapping respecting
brackets in the sense that $\varphi[X, Y] = [\varphi(X), \varphi(Y)]$ for all $X, Y \in A_1$, and an
isomorphism is an invertible homomorphism. Every ideal is a Lie subalgebra. 
In contrast to the case of rings, there is no distinction between “left ideals” and 
“right ideals” because the bracket product is skew symmetric. Under the passage 
from Lie groups to Lie algebras, abelian Lie groups yield Lie algebras with all 
brackets 0, and thus one says that a Lie algebra is abelian if all its brackets are 0. 

Examples of Lie subalgebras of gl(n, F) are the subalgebra sl(n, F) of all 
matrices of trace 0, the subalgebra so(n, F) of all skew-symmetric matrices, and
the subalgebra of all upper-triangular matrices. 

The elementary properties of subalgebras, homomorphisms, and so on for Lie 
algebras mimic what is true for rings: The kernel of a homomorphism is an 
ideal. Any ideal is the kernel of a quotient homomorphism. If I is an ideal in
A, then the ideals of A/I correspond to the ideals of A containing I, just as
in the First Isomorphism Theorem for rings. If I and J are ideals in A, then
$(I + J)/I \cong J/(I \cap J)$, just as in the Second Isomorphism Theorem for rings.

The connection of Lie algebras to Lie groups makes one want to introduce 
definitions that lead toward classifying all Lie algebras that are finite-dimensional. 
We therefore assume for the remainder of this section that all Lie algebras under 
discussion are finite-dimensional over F. Some of the steps require conditions 
on F, and we shall assume that F has characteristic 0.

Group theory already had a notion of “solvable group” from Galois, and this 
leads to the notion of solvable Lie algebra. In A, let [A, A] denote the linear span 
of all [X, Y] with $X, Y \in A$; [A, A] is called the commutator ideal of A, and
A/[A, A] is abelian. In fact, [A, A] is the smallest ideal J in A such that A/J
is abelian. Starting from A, let us form successive commutator ideals. Thus put
$A_0 = A$, $A_1 = [A_0, A_0]$, ..., $A_n = [A_{n-1}, A_{n-1}]$, so that

$$A = A_0 \supseteq A_1 \supseteq \cdots \supseteq A_n \supseteq \cdots.$$

The terms of this sequence are all the same from some point on, by finite dimensionality,
and we say that A is solvable if the terms are ultimately 0. One easily
checks that the sum I + J of two solvable ideals in A, i.e., the set of sums, is
a solvable ideal. By finite dimensionality, there exists a unique largest solvable
ideal. This is called the radical of A and is denoted by rad A. The Lie algebra
A is said to be semisimple if rad A = 0. It is easy to use the First Isomorphism
Theorem to check that A/rad A is always semisimple.
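The sequence of successive commutator ideals can be computed mechanically for a matrix Lie algebra given by a basis. The sketch below is a minimal illustration over the rationals, with hypothetical helper names (`span_basis`, `derived_series_dims`, `unit`) of our own: it spans the brackets of basis elements by Gaussian elimination and records the dimensions of $A_0, A_1, \dots$ until the series reaches 0 or stops shrinking. It is applied to the upper-triangular 3-by-3 matrices and to gl(2, Q).

```python
from fractions import Fraction

def bracket(X, Y):
    """Bracket [X, Y] = XY - YX for square matrices stored as tuples of rows."""
    n = len(X)
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] - Y[i][k] * X[k][j] for k in range(n))
              for j in range(n))
        for i in range(n))

def flatten(X):
    return [x for row in X for x in row]

def unflatten(v, n):
    return tuple(tuple(v[i * n:(i + 1) * n]) for i in range(n))

def span_basis(vectors):
    """Basis of the rational span of the vectors, by Gaussian elimination."""
    basis = []                        # pairs (pivot position, vector with pivot 1)
    for v in vectors:
        v = [Fraction(x) for x in v]
        for piv, b in basis:
            if v[piv] != 0:
                c = v[piv]
                v = [x - c * y for x, y in zip(v, b)]
        piv = next((i for i, x in enumerate(v) if x != 0), None)
        if piv is not None:
            lead = v[piv]
            basis.append((piv, [x / lead for x in v]))
    return [b for _, b in basis]

def derived_series_dims(basis, n):
    """Dimensions of A_0, A_1, ... until the series is 0 or stops shrinking."""
    dims = [len(basis)]
    while basis:
        brackets = [flatten(bracket(X, Y)) for X in basis for Y in basis]
        new_basis = [unflatten(v, n) for v in span_basis(brackets)]
        if len(new_basis) == len(basis):
            break                     # stabilized at a nonzero ideal
        basis = new_basis
        dims.append(len(basis))
    return dims

def unit(n, i, j):
    """Matrix unit E_ij of size n."""
    return tuple(tuple(1 if (r, c) == (i, j) else 0 for c in range(n))
                 for r in range(n))

upper = [unit(3, i, j) for i in range(3) for j in range(i, 3)]
full = [unit(2, i, j) for i in range(2) for j in range(2)]
print(derived_series_dims(upper, 3))   # [6, 3, 1, 0]
print(derived_series_dims(full, 2))    # [4, 3]
```

The upper-triangular algebra reaches 0 and so is solvable, whereas for gl(2, Q) the series stabilizes at the 3-dimensional subalgebra of trace-0 matrices and never reaches 0.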

In the direction of classifying Lie algebras, one might therefore want to see how 
all solvable Lie algebras can be constructed by successive extensions, identify 
all semisimple Lie algebras, and determine how a general Lie algebra can be 
constructed from a semisimple Lie algebra and a solvable Lie algebra by an 
extension. 

The first step in this direction historically concerned identifying semisimple 
Lie algebras. We say that the Lie algebra A is simple if dim A > 1 and if A
contains no nonzero proper ideals. 

Working with the field C but in a way that applies to other fields of 
characteristic 0, W. Killing proved in 1888 that A is semisimple if and only 
if A is the (internal) direct sum of simple ideals. In this case the direct summands 
are unique, and the only ideals in A are the partial direct sums. 

This result is strikingly different from what happens for abelian Lie algebras, 
for which the theory reduces to the theory of vector spaces. A 2-dimensional 
vector space is the internal direct sum of two 1-dimensional subspaces in many 
ways. But Killing’s theorem says that the decomposition of semisimple Lie 
algebras into simple ideals is unique, not just unique up to some isomorphism. 

E. Cartan in his 1894 thesis classified the simple Lie algebras, up to isomorphism,
for the case that the field is C. The Lie algebras sl(n, C) for n ≥ 2 and
so(n, C) for n = 3 and n ≥ 5 were in his list, and there were others. Killing had
come close to this classification in his 1888 work, but he had made a number of 
errors in both his statements and his proofs. 

E. E. Levi in 1905 addressed the extension problem for obtaining all finite- 
dimensional Lie algebras over C from semisimple ones and solvable ones. His 
theorem is that for any Lie algebra A, there exists a subalgebra S isomorphic to 
A/rad A such that A = S ⊕ rad A as vector spaces. In essence, this result says
that the extension defining A is given by a semidirect product. 

The final theorem in this vein at this time in history was a 1914 result of Cartan 
classifying the simple Lie algebras when the field F is R. This classification is a 
good bit more complicated than the classification when F is C. 


With this background in mind, we can put into context the corresponding 
developments for associative algebras. Although others had done some earlier 
work, J. H. M. Wedderburn made the first big advance for associative algebras in 
1905. Wedderburn’s theory in a certain sense is more complicated than the theory 
for Lie algebras because left ideals in the associative case are not necessarily two- 
sided ideals. Let us sketch this theory. 

For the remainder of this section until the last paragraph, A will denote a
finite-dimensional associative algebra over a field F of characteristic 0, possibly the 0
algebra. We shall always assume that A has an identity. Although we shall make
some definitions here, we shall repeat them later in the chapter at the appropriate
times. For many results later in the chapter, the field F will not be assumed to be
of characteristic 0.

As in Chapter X of Basic Algebra, a unital left A module is said to be simple
if it is nonzero and it has no proper nonzero A submodules, semisimple if it is the
sum (or equivalently the direct sum) of simple A submodules. The algebra A is
semisimple if the left A module A is a semisimple module, i.e., if A is the direct
sum of simple left ideals; A is simple if it is nonzero and has no nontrivial two- 
sided ideals. In contrast to the setting of Lie algebras, we make no exception for 
the 1-dimensional case; this distinction is necessary and is continually responsible 
for subtle differences between the two theories. 

Wedderburn’s first theorem has two parts to it, the first one modeled on Killing’s 
theorem for Lie algebras and the second one modeled on Cartan’s thesis: 


(i) The algebra A is semisimple if and only if it is the (internal) direct sum 
of simple two-sided ideals. In this case the direct summands are unique, 
and the only two-sided ideals of A are the partial direct sums. 

(ii) The algebra A is simple if and only if $A \cong M_n(D)$ for some integer n ≥ 1
and some division algebra D over F. In particular, if F is algebraically
closed, then $A \cong M_n(F)$ for some n.


E. Artin generalized the Wedderburn theory to a suitable kind of “semisimple 
ring.” For part of the theory, he introduced a notion of “radical” for the associative 
case — the radical of a finite-dimensional associative algebra A being the sum of 
the “nilpotent” left ideals of A. Here a left ideal I is called nilpotent if $I^k = 0$
for some k. The radical rad A is a two-sided ideal, and A/rad A is a semisimple
ring. 

Wedderburn’s Main Theorem, proved later in time and definitely assuming 
characteristic 0, is an analog for associative algebras of Levi’s result about Lie 
algebras. The result for associative algebras is that A decomposes as a vector-space
direct sum A = S ⊕ rad A, where S is a semisimple subalgebra isomorphic
to A/rad A.


The remaining structural question for finite-dimensional associative algebras 
is to say something about simple algebras when the field is not algebraically 
closed. Such a result may be regarded as an analog of the 1914 work by Cartan. 
In the associative case one then wants to know what the F isomorphism classes of 
finite-dimensional associative division algebras D are for a given field F. We now 
drop the assumption that the field F has characteristic 0. In asking this question, 
one does not want to repeat the theory of field extensions. Consequently one 
looks only for classes of division algebras whose center is F. If F is algebraically 
closed, the only such D is F itself, as we shall observe in more detail in Section 2.
If F is a finite field, one is led to another theorem of Wedderburn’s, saying that D
has to be commutative and hence that D = F; this theorem appears in Section 9.
If F is R, one is led to a theorem of Frobenius saying that there are just two such
D’s up to R isomorphism, namely R itself and the quaternions H; this theorem
appears in Section 10. For a general field F, it turns out that the set of classes
of finite-dimensional division algebras with center F forms an abelian group.
The group is called the “Brauer group” of F. Its multiplication is defined by the
condition that the class of $D_1$ times $D_2$ is the class of a division algebra $D_3$ such
that $D_1 \otimes_F D_2 \cong M_n(D_3)$ for some n; the inverse of the class of D is the class
of the opposite algebra $D^o$, and the identity is the class of F. The study of the
Brauer group is postponed to Chapter III. This group has an interpretation in terms 
of cohomology of groups, and it has applications to algebraic number theory. 


2. Semisimple Rings and Wedderburn’s Theorem 


We now begin our detailed investigation of associative algebras over a field. In 
this section we shall address the first theorem of Wedderburn’s that is mentioned 
in the previous section. It has two parts, one dealing with semisimple algebras 
and one dealing with finite-dimensional simple algebras. The first part does not 
need the finite dimensionality as a hypothesis, and we begin with that one. 

Let R be a ring with identity. The ring R is left semisimple if the left R
module R is a semisimple module, i.e., if R is the direct sum of minimal left
ideals.¹ In this case $R = \bigoplus_{i \in S} I_i$ for some set S and suitable minimal left
ideals $I_i$. Since R has an identity, we can decompose the identity according to
the direct sum as $1 = 1_{i_1} + \cdots + 1_{i_n}$ for some finite subset $\{i_1, \dots, i_n\}$ of S,
where $1_{i_k}$ is the component of 1 in $I_{i_k}$. Multiplying by $r \in R$ on the left, we
see that $R \subseteq \bigoplus_{k=1}^{n} I_{i_k}$. Consequently R has to be a finite sum of minimal left
ideals. A ring R with identity is right semisimple if the right R module R is a
semisimple module. We shall see later in this section that left semisimple and
right semisimple are equivalent.

¹By convention, a “minimal left ideal” always means a “minimal nonzero left ideal.”


EXAMPLES OF SEMISIMPLE RINGS.


(1) If D is a division ring, then we saw in Example 4 in Section X.1 of Basic
Algebra that the ring $R = M_n(D)$ is left semisimple in the sense of the above
definition. Actually, that example showed more. It showed that R as a left R
module is given by $M_n(D) \cong D^n \oplus \cdots \oplus D^n$, where each $D^n$ is a simple left R
module and the $j^{\text{th}}$ summand $D^n$ corresponds to the matrices whose only nonzero
entries are in the $j^{\text{th}}$ column. The left R module $M_n(D)$ has a composition series
whose terms are the partial sums of the n summands $D^n$. If M is any simple
left $M_n(D)$ module and if $x \ne 0$ is in M, then $M = M_n(D)x$. If we set
$I = \{r \in M_n(D) \mid rx = 0\}$, then I is a left ideal in $M_n(D)$ and $M \cong M_n(D)/I$
as a left $M_n(D)$ module. In other words, M is an irreducible quotient module
of the left $M_n(D)$ module $M_n(D)$. By the Jordan-Hölder Theorem (Corollary
10.7 of Basic Algebra), M occurs as a composition factor. Hence $M \cong D^n$ as
a left $M_n(D)$ module. Hence every simple left $M_n(D)$ module is isomorphic to
$D^n$. We shall use this style of argument repeatedly but will ordinarily include
less detail.

(2) If $R_1, \dots, R_n$ are left semisimple rings, then the direct product
$R = \prod_{i=1}^{n} R_i$ is left semisimple.² In fact, each minimal left ideal of $R_i$, when included
into R, is a minimal left ideal of R. Hence R is the sum of minimal left ideals
and is left semisimple. By the same kind of argument as for Example 1, every
simple left R module is isomorphic to one of these minimal left ideals.
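The decomposition of the identity used above, and the column description in Example 1, can be seen concretely in the ring of 2-by-2 integer matrices. In the following Python sketch the helper names (`matmul`, `add`, `unit`, `nonzero_columns`) and the sample matrices A and B are our own choices for illustration. The diagonal matrix units $E_{11}$ and $E_{22}$ play the role of the components of 1, each $AE_{jj}$ is the component of A in the $j^{\text{th}}$ column ideal, and left multiplication by any B keeps that component in the same column.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def unit(n, i, j):
    """Matrix unit E_ij of size n."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def nonzero_columns(X):
    """Set of column indices in which X has a nonzero entry."""
    return {j for row in X for j, x in enumerate(row) if x != 0}

n = 2
E = [unit(n, j, j) for j in range(n)]      # components of the identity
A = [[1, 2], [3, 4]]
B = [[5, -1], [0, 7]]

# The identity is the sum of its components: 1 = E_11 + E_22.
print(add(E[0], E[1]))                      # [[1, 0], [0, 1]]

# A = A E_11 + A E_22, one summand in each column ideal.
components = [matmul(A, e) for e in E]
print(components)                           # [[[1, 0], [3, 0]], [[0, 2], [0, 4]]]

# Left multiplication preserves each column ideal.
print([nonzero_columns(matmul(B, c)) for c in components])   # [{0}, {1}]
```

The same computation with quaternion or other division-ring entries would go through unchanged, since only left multiplication by matrices is involved.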


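Lemma 2.1. Let D be a division ring, let $R = M_n(D)$, and let $D^n$ be the
simple left R module of column vectors. Each member of D acts on $D^n$ by
scalar multiplication on the right side, yielding a member of $\operatorname{End}_R(D^n)$. In turn,
$\operatorname{End}_R(D^n)$ is a ring, and this identification therefore is an inclusion of the members
of D into the right D module $\operatorname{End}_R(D^n)$. The inclusion is in fact an isomorphism
of rings: $D^o \cong \operatorname{End}_R(D^n)$, where $D^o$ is the opposite ring of D.


PROOF. Let $\varphi : D \to \operatorname{End}_R(D^n)$ be the function given by $\varphi(d)(v) = vd$.
Then $\varphi(dd')(v) = v(dd') = (vd)d' = \varphi(d')(vd) = \varphi(d')(\varphi(d)(v))$. Since the
order of multiplication in D is reversed by $\varphi$, $\varphi$ is a ring homomorphism of $D^o$
into $\operatorname{End}_R(D^n)$. It is one-one because $D^o$ is a division ring and has no nontrivial
two-sided ideals. To see that it is onto $\operatorname{End}_R(D^n)$, let f be in $\operatorname{End}_R(D^n)$. Put

$$f\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} d \\ d_2 \\ \vdots \\ d_n \end{pmatrix}.$$

Since f is an R module homomorphism,

$$\begin{aligned}
f\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}
&= f\left(\begin{pmatrix} a_1 & 0 & \cdots & 0 \\ a_2 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ a_n & 0 & \cdots & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}\right)
= \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ a_2 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ a_n & 0 & \cdots & 0 \end{pmatrix} f\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \\
&= \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ a_2 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ a_n & 0 & \cdots & 0 \end{pmatrix}\begin{pmatrix} d \\ d_2 \\ \vdots \\ d_n \end{pmatrix}
= \begin{pmatrix} a_1 d \\ a_2 d \\ \vdots \\ a_n d \end{pmatrix}
= \varphi(d)\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}.
\end{aligned}$$

Therefore $\varphi(d) = f$, and $\varphi$ is onto.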
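The two facts about $\varphi$ used in this proof, that right multiplication by d commutes with the left action of matrices and that $\varphi$ reverses the order of products, can be tried out when D is noncommutative. The sketch below takes quaternions with integer coefficients as an illustrative stand-in for D, with n = 2. The representation of a quaternion a + bi + cj + dk as a 4-tuple, the helper names (`qmul`, `qadd`, `act`, `phi`), and the sample data are assumptions of this sketch, not notation from the text.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def act(A, v):
    """Left action of a matrix A with quaternion entries on a column vector v."""
    out = []
    for row in A:
        total = (0, 0, 0, 0)
        for entry, x in zip(row, v):
            total = qadd(total, qmul(entry, x))
        out.append(total)
    return out

def phi(d):
    """phi(d) sends v to vd, scalar multiplication on the right side."""
    return lambda v: [qmul(x, d) for x in v]

I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
A = [[(1, 2, 0, -1), J], [K, (0, 3, 1, 1)]]
v = [(2, 0, 1, 0), (0, 1, 1, 5)]
d, e = (1, 1, 0, 2), (0, 2, -1, 1)

# phi(d) is an R module homomorphism: (Av)d = A(vd).
print(phi(d)(act(A, v)) == act(A, phi(d)(v)))        # True

# phi reverses products: phi(de) = phi(e) composed with phi(d).
print(phi(qmul(d, e))(v) == phi(e)(phi(d)(v)))       # True

# Without the reversal the identity fails: v(ij) = vk but (vj)i = -vk.
print(phi(qmul(I, J))(v) == phi(I)(phi(J)(v)))       # False
```

The last line is the reason the lemma is stated with the opposite ring $D^o$ rather than with D itself.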


²Some comment is appropriate about the notation $R = \prod_{i=1}^{n} R_i$ and the terminology “direct
product.” Indeed, $\prod_{i=1}^{n} R_i$ is a product in the sense of category theory within the category of rings
or the category of rings with identity. Sometimes one views R alternatively as built from n two-sided
ideals, each corresponding to one of the n coordinates; in this case, one may say that R is the “direct
sum” of these ideals. This direct sum is to be regarded as a direct sum of abelian groups, or perhaps
vector spaces or R modules, but it is not a coproduct within the category of rings with identity.


Theorem 2.2 (Wedderburn). If R is any left semisimple ring, then

$$R \cong M_{n_1}(D_1) \times \cdots \times M_{n_r}(D_r)$$

for suitable division rings $D_1, \dots, D_r$ and positive integers $n_1, \dots, n_r$. The number
r is uniquely determined by R, and the ordered pairs $(n_1, D_1), \dots, (n_r, D_r)$
are determined up to a permutation of $\{1, \dots, r\}$ and an isomorphism of each
$D_j$. There are exactly r mutually nonisomorphic simple left R modules, namely
$(D_1)^{n_1}, \dots, (D_r)^{n_r}$.


PROOF. Write R as the direct sum of minimal left ideals, and then regroup
the summands according to their R isomorphism type as $R \cong \bigoplus_{j=1}^{r} n_j V_j$, where
$n_j V_j$ is the direct sum of $n_j$ submodules R isomorphic to $V_j$ and where $V_i \not\cong V_j$
for $i \ne j$. The isomorphism is one of unital left R modules. Put $D_i^o = \operatorname{End}_R(V_i)$.
This is a division ring by Schur’s Lemma (Proposition 10.4b of Basic Algebra).
Using Proposition 10.14 of Basic Algebra, we obtain an isomorphism of rings

$$R^o \cong \operatorname{End}_R R \cong \operatorname{Hom}_R\Big(\bigoplus_{i=1}^{r} n_i V_i,\ \bigoplus_{j=1}^{r} n_j V_j\Big). \qquad (*)$$

Define $p_i : \bigoplus_{j=1}^{r} n_j V_j \to n_i V_i$ to be the $i^{\text{th}}$ projection and $q_i : n_i V_i \to
\bigoplus_{j=1}^{r} n_j V_j$ to be the $i^{\text{th}}$ inclusion. Let us see that the right side of (∗) is isomorphic
as a ring to $\prod_i \operatorname{End}_R(n_i V_i)$ via the mapping $f \mapsto (p_1 f q_1, \dots, p_r f q_r)$.
What is to be shown is that $p_j f q_i = 0$ for $i \ne j$. Here $p_j f q_i$ is a member
of $\operatorname{Hom}_R(n_i V_i, n_j V_j)$. The abelian group $\operatorname{Hom}_R(n_i V_i, n_j V_j)$ is the direct sum
of abelian groups isomorphic to $\operatorname{Hom}_R(V_i, V_j)$ by Proposition 10.12, and each
$\operatorname{Hom}_R(V_i, V_j)$ is 0 by Schur’s Lemma (Proposition 10.4a).

Referring to (∗), we therefore obtain ring isomorphisms

$$\begin{aligned}
R^o &\cong \prod_{i=1}^{r} \operatorname{Hom}_R(n_i V_i, n_i V_i) = \prod_{i=1}^{r} \operatorname{End}_R(n_i V_i) \\
&\cong \prod_{i=1}^{r} M_{n_i}(\operatorname{End}_R(V_i)) && \text{by Corollary 10.13} \\
&= \prod_{i=1}^{r} M_{n_i}(D_i^o) && \text{by definition of } D_i^o.
\end{aligned}$$

Reversing the order of multiplication in $R^o$ and using the transpose map to
reverse the order of multiplication in each $M_{n_i}(D_i^o)$, we conclude that
$R \cong \prod_{i=1}^{r} M_{n_i}(D_i)$. This proves existence of the decomposition in the theorem.

We still have to identify the simple left R modules and prove an appropriate
uniqueness statement. As we recalled in Example 1, we have a decomposition
$M_{n_i}(D_i) \cong D_i^{n_i} \oplus \cdots \oplus D_i^{n_i}$ of left $M_{n_i}(D_i)$ modules, and each term $D_i^{n_i}$ is a
simple left $M_{n_i}(D_i)$ module. The decomposition just proved allows us to regard
each term $D_i^{n_i}$ as a simple left R module, $1 \le i \le r$. Each of these modules
is acted upon by a different coordinate of R, and hence we have produced at
least r nonisomorphic simple left R modules. Any simple left R module must
be a quotient of R by a maximal left ideal, as we observed in Example 2, hence
a composition factor as a consequence of the Jordan-Hölder Theorem. Thus
it must be one of the $V_j$’s in the previous part of the proof. There are only
r nonisomorphic such $V_j$’s, and we conclude that the number of simple left R
modules, up to isomorphism, is exactly r.

For uniqueness suppose that $R \cong M_{n'_1}(D'_1) \times \cdots \times M_{n'_s}(D'_s)$ as rings. Let
$V'_j = (D'_j)^{n'_j}$ be the unique simple left $M_{n'_j}(D'_j)$ module up to isomorphism, and
regard $V'_j$ as a simple left R module. Then we have $R \cong \bigoplus_{j=1}^{s} n'_j V'_j$ as left
R modules. By the Jordan-Hölder Theorem we must have r = s and, after a
suitable renumbering, $n_i = n'_i$ and $V_i \cong V'_i$ for $1 \le i \le r$. Thus we have ring
isomorphisms

$$\begin{aligned}
(D'_i)^o &\cong \operatorname{End}_{M_{n'_i}(D'_i)}(V'_i) && \text{by Lemma 2.1} \\
&= \operatorname{End}_R(V'_i) \\
&\cong \operatorname{End}_R(V_i) && \text{since } V_i \cong V'_i \\
&= D_i^o.
\end{aligned}$$

Reversing the order of multiplication gives $D'_i \cong D_i$, and the proof is complete.


Corollary 2.3. For a ring R, left semisimple coincides with right semisimple. 


REMARK. Therefore we can henceforth refer to left semisimple rings unam- 
biguously as semisimple. 


PROOF. The theorem gives the form of any left semisimple ring, and each ring 
of this form is certainly right semisimple. 


Wedderburn’s original formulation of Theorem 2.2 was for algebras over a
field F, and he assumed finite dimensionality. The theorem in this case gives

$$R \cong M_{n_1}(D_1) \times \cdots \times M_{n_r}(D_r),$$

and the proof shows that $D_i^o \cong \operatorname{End}_R(V_i)$, where $V_i$ is a minimal left ideal of
R of the $i$th isomorphism type. The field F lies inside $\operatorname{End}_R(V_i)$, each member
of F yielding a scalar mapping, and hence each $D_i$ is a division algebra over
F. Each $D_i$ is necessarily finite-dimensional over F, since R was assumed to be
finite-dimensional.




We shall make occasional use in this chapter of the fact that if D is a finite-
dimensional division algebra over an algebraically closed field F, then D = F.
To see this equality, suppose that x is a member of D but not of F, i.e., is not an
F multiple of the identity. Then x and F together generate a subfield F(x) of D
that is a nontrivial algebraic extension of F, contradiction. Consequently every
finite-dimensional semisimple algebra R over an algebraically closed field F is
of the form

$$R \cong M_{n_1}(F) \times \cdots \times M_{n_r}(F),$$

for suitable integers $n_1, \dots, n_r$.

As we saw, the finite dimensionality plays no role in decomposing semisimple
rings as the finite product of rings that we shall call “simple.” The place
where finite dimensionality enters the discussion is in identifying simple rings 
as semisimple, hence in establishing a converse theorem that every finite direct 
product of simple rings, each equal to an ideal of the given ring, is necessarily 
semisimple. We say that a nonzero ring R with identity is simple if its only 
two-sided ideals are 0 and R. 


EXAMPLES OF SIMPLE RINGS. 


(1) If D is a division ring, then $M_n(D)$ is a simple ring. In fact, let J be a
two-sided ideal in $M_n(D)$, fix an ordered pair $(i, j)$ of indices, and let

$$I = \{x \in D \mid \text{some member } X \text{ of } J \text{ has } X_{ij} = x\}.$$

Multiplying X in this definition on each side by scalar matrices with entries in
D, we see that $I$ is a two-sided ideal in D. If $I = 0$ for all $(i, j)$, then $J = 0$.
So assume for some $(i, j)$ that $I \neq 0$. Then $I = D$ for that $(i, j)$, and we may
suppose that some X in J has $X_{ij} = 1$. If $E_{kl}$ denotes the matrix that is 1 in
the $(k, l)$th place and is 0 elsewhere, then $E_{ii}XE_{jj} = E_{ij}$ has to be in J. Hence
$E_{kl} = E_{ki}E_{ij}E_{jl}$ has to be in J, and $J = M_n(D)$.
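
The matrix-unit argument in Example 1 can be checked mechanically in a small case. The following is a minimal sketch, assuming D = Q and n = 3; the matrix `X` and the helper names `matmul`, `unit`, and `scale` are illustrative choices and are not part of the text.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def unit(n, k, l):
    """Matrix unit E_kl: 1 in the (k, l) place (0-indexed) and 0 elsewhere."""
    return [[Fraction(int((i, j) == (k, l))) for j in range(n)] for i in range(n)]

def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]

n = 3
# A hypothetical nonzero member X of a two-sided ideal J of M_3(Q).
X = [[Fraction(0), Fraction(2), Fraction(0)],
     [Fraction(5), Fraction(0), Fraction(0)],
     [Fraction(0), Fraction(0), Fraction(0)]]
i, j = 0, 1                      # X[i][j] = 2 is nonzero

# E_ii X E_jj isolates the (i, j) entry: it equals X_ij * E_ij.
isolated = matmul(matmul(unit(n, i, i), X), unit(n, j, j))
# Dividing by X_ij (possible in a division ring) puts E_ij itself in J.
E_ij = scale(1 / X[i][j], isolated)
# Every E_kl = E_ki E_ij E_jl then lies in J, so J is all of M_3(Q).
generated = {(k, l): matmul(matmul(unit(n, k, i), E_ij), unit(n, j, l))
             for k in range(n) for l in range(n)}
```

Each value `generated[(k, l)]` is obtained from `X` by left and right multiplications only, which is exactly why a nonzero two-sided ideal must contain every matrix unit.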


(2) Let R be the Weyl algebra over $\mathbb{C}$ in one variable, namely

$$R = \Bigl\{ \sum_{n \ge 0} P_n(x) \Bigl(\frac{d}{dx}\Bigr)^n \Bigm| \text{each } P_n \text{ is in } \mathbb{C}[x], \text{ and the sum is finite} \Bigr\}.$$

To give a more abstract construction of R, we can view R as $\mathbb{C}\bigl[x, \frac{d}{dx}\bigr]$ subject to
the relation $\frac{d}{dx}\,x = x\,\frac{d}{dx} + 1$; this is not to be a quotient of a polynomial algebra
in two variables but a quotient of a tensor algebra in two variables. We omit the
details. We shall now prove that the ring R is simple but not semisimple.


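To see that R is a simple ring, we easily check the two identities

(i) $\dfrac{d}{dx}\Bigl(x^m \dfrac{d^n}{dx^n}\Bigr) = m x^{m-1} \dfrac{d^n}{dx^n} + x^m \dfrac{d^{n+1}}{dx^{n+1}}$ by the product rule,

(ii) $\dfrac{d^n}{dx^n}\, x = n \dfrac{d^{n-1}}{dx^{n-1}} + x \dfrac{d^n}{dx^n}$ by induction when applied to a polynomial $f(x)$.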




Let J be a nonzero two-sided ideal in R, and fix an element $X \neq 0$ in J. Let $x^m$
be the highest power of x appearing in X, and let $\bigl(\frac{d}{dx}\bigr)^n$ be the highest power of $\frac{d}{dx}$
appearing in terms of X involving $x^m$. Let $l$ and $r$ denote “left multiplication by”
and “right multiplication by,” and apply $\bigl(l\bigl(\frac{d}{dx}\bigr) - r\bigl(\frac{d}{dx}\bigr)\bigr)^m$ to X. Since (i) shows
that

$$\Bigl(l\Bigl(\frac{d}{dx}\Bigr) - r\Bigl(\frac{d}{dx}\Bigr)\Bigr)\Bigl(x^k \frac{d^l}{dx^l}\Bigr) = k x^{k-1} \frac{d^l}{dx^l},$$

the result of computing $\bigl(l\bigl(\frac{d}{dx}\bigr) - r\bigl(\frac{d}{dx}\bigr)\bigr)^m X$ is a polynomial in $\frac{d}{dx}$ of degree
exactly n with no x’s. Application of $(r(x) - l(x))^n$ to the result, using (ii),
yields a nonzero constant. We conclude that 1 is in J and therefore that J = R.
Hence R is simple.
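
The two commutator steps of this argument can be traced on a concrete element. The sketch below is only illustrative: it assumes integer coefficients, stores an operator as a dictionary `{(a, b): coeff}` standing for the sum of `coeff * x^a D^b` with D = d/dx, and uses a made-up test element `X`. The names `mul`, `sub`, `ad_D`, and `ad_x` are hypothetical helpers, not notation from the text.

```python
from math import comb, factorial
from collections import defaultdict

# An element of the Weyl algebra is stored as {(a, b): coeff}, meaning the
# sum of coeff * x^a D^b with D = d/dx and all x's written to the left.

def mul(P, Q):
    """Product in normal form, via D^b x^c = sum_k C(b,k) c!/(c-k)! x^(c-k) D^(b-k)."""
    out = defaultdict(int)
    for (a, b), p in P.items():
        for (c, d), q in Q.items():
            for k in range(min(b, c) + 1):
                coeff = comb(b, k) * factorial(c) // factorial(c - k)
                out[(a + c - k, b + d - k)] += p * q * coeff
    return {key: v for key, v in out.items() if v != 0}

def sub(P, Q):
    """Difference P - Q, dropping zero coefficients."""
    out = defaultdict(int, P)
    for key, v in Q.items():
        out[key] -= v
    return {key: v for key, v in out.items() if v != 0}

x = {(1, 0): 1}
D = {(0, 1): 1}

def ad_D(P):
    """(l(D) - r(D)) P = D P - P D."""
    return sub(mul(D, P), mul(P, D))

def ad_x(P):
    """(r(x) - l(x)) P = P x - x P."""
    return sub(mul(P, x), mul(x, P))

# Hypothetical test element X = x^2 D^3 + 5 x D - 7 D^4, so m = 2 and n = 3.
X = {(2, 3): 1, (1, 1): 5, (0, 4): -7}
m, n = 2, 3
Y = X
for _ in range(m):
    Y = ad_D(Y)      # removes the x's: leaves m! * D^n
for _ in range(n):
    Y = ad_x(Y)      # removes the D's: leaves the constant m! * n!
```

For this choice of `X` the final value of `Y` is the constant 2! · 3! = 12, a nonzero scalar lying in any two-sided ideal that contains `X`.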

To show that R is not semisimple, first note that $\mathbb{C}[x]$ is a natural unital left R
module. We shall show that R has infinite length as a left R module, in the sense
of the length of finite filtrations. In fact,

$$R \supseteq R\Bigl(\frac{d}{dx}\Bigr) \supseteq R\Bigl(\frac{d}{dx}\Bigr)^2 \supseteq \cdots \supseteq R\Bigl(\frac{d}{dx}\Bigr)^n \qquad (*)$$

is a finite filtration of left R submodules of R. If $R\bigl(\frac{d}{dx}\bigr)^k = R\bigl(\frac{d}{dx}\bigr)^{k+1}$, then
$\bigl(\frac{d}{dx}\bigr)^k = r\bigl(\frac{d}{dx}\bigr)^{k+1}$ for some $r \in R$. Applying these two equal expressions for
a member of R to the member $x^k$ of the left R module $\mathbb{C}[x]$, we arrive at a
contradiction and conclude that every inclusion in (*) is strict. Therefore R has
infinite length and is not semisimple.


The extra hypothesis that Wedderburn imposed so that simple rings would 
turn out to be semisimple is finite dimensionality. Wedderburn’s result in this 
direction is Theorem 2.4 below. This hypothesis is quite natural to the extent 
that the subject was originally motivated by the theory of Lie algebras. E. Artin 
found a substitute for the assumption of finite dimensionality that takes the result 
beyond the realm of algebras, and we take up Artin’s idea in the next section. 


Theorem 2.4 (Wedderburn). Let R be a finite-dimensional algebra with
identity over a field F. If R is a simple ring, then R is semisimple and hence
is isomorphic to $M_n(D)$ for some integer $n \ge 1$ and some finite-dimensional
division algebra D over F. The integer n is uniquely determined by R, and D is
unique up to isomorphism.


PROOF. By finite dimensionality, R has a minimal left ideal V. For r in R,
form the set Vr. This is a left ideal, and we claim that it is minimal or is 0. In
fact, the function $v \mapsto vr$ is R linear from V onto Vr. Since V is simple as a
left R module, Vr is simple or 0. The sum $I = \sum_{r \text{ with } Vr \neq 0} Vr$ is a two-sided
ideal in R, and it is not 0 because $V1 \neq 0$. Since R is simple, $I = R$. Then the
left R module R is exhibited as the sum of simple left R modules and is therefore
semisimple. The isomorphism with $M_n(D)$ and the uniqueness now follow from
Theorem 2.2.


3. Rings with Chain Condition and Artin’s Theorem 


Parts of Chapters VII and IX of Basic Algebra made considerable use of a 
hypothesis that certain commutative rings are “Noetherian,” and we now extend 
this notion to noncommutative rings. A ring R with identity is left Noetherian if 
the left R module R satisfies the ascending chain condition for its left ideals. It is 
left Artinian if the left R module R satisfies the descending chain condition for 
its left ideals. The notions of right Noetherian and right Artinian are defined 
similarly. 

We saw many examples of Noetherian rings in the commutative case in Basic 
Algebra. The ring of integers Z is Noetherian, and so is the ring of polynomials 
R[X] in an indeterminate over a nonzero Noetherian ring R. It follows from the 
latter example that the ring $F[X_1, \dots, X_n]$ in finitely many indeterminates over
a field is a Noetherian ring. Other examples arose in connection with extensions 
of Dedekind domains. 

Any finite direct product of fields is Noetherian and Artinian because it has a 
composition series and because its ideals therefore satisfy both chain conditions. 
If p is any prime, the ring $\mathbb{Z}/p^2\mathbb{Z}$ is Noetherian and Artinian for the same reason,
and it is not a direct product of fields. 
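
As a quick illustration of the last example, the ideals of $\mathbb{Z}/p^2\mathbb{Z}$ can be listed by brute force. This is a sketch under the assumption p = 3; the function name `ideals` is a hypothetical helper, and it relies on the standard fact that every ideal of Z/nZ is principal.

```python
def ideals(n):
    """All ideals of Z/nZ as frozensets; each is generated by a single residue d."""
    found = {frozenset(d * k % n for k in range(n)) for d in range(n)}
    return sorted(found, key=len)

p = 3
chain = ideals(p * p)            # the ideals of Z/9Z, ordered by size
# Nonzero elements whose square is 0; a product of fields has none.
nilpotents = [a for a in range(1, p * p) if (a * a) % (p * p) == 0]
```

The ideals form a single finite chain, so both chain conditions hold, while the nonzero nilpotent residues show that the ring cannot be a direct product of fields.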

In the noncommutative setting, any semisimple ring is necessarily left Noe- 
therian and left Artinian because it has a composition series for its left ideals and 
the left ideals therefore satisfy both chain conditions. 


Proposition 2.5. Let R be a ring with identity, and let M be a finitely generated 
unital left R module. If R is left Noetherian, then M satisfies the ascending 
chain condition for its R submodules; if R is left Artinian, then M satisfies the 
descending chain condition for its R submodules. 


PROOF. We prove the first conclusion by induction on the number of generators,
and the proof of the second conclusion is completely similar. The result is trivial
if M has 0 generators. If M = Rx, then M is a quotient of the left R module
R and satisfies the ascending chain condition for its R submodules, according to
Proposition 10.10 of Basic Algebra. For the inductive step with $n \ge 2$ generators,
write $M = Rx_1 + \cdots + Rx_n$ and $N = Rx_1 + \cdots + Rx_{n-1}$. Then N satisfies
the ascending chain condition for its R submodules by the inductive hypothesis,
and M/N is isomorphic to $Rx_n/(N \cap Rx_n)$, which satisfies the ascending chain
condition for its R submodules by the inductive hypothesis. Therefore M satisfies
the ascending chain condition for its R submodules by application of the converse
direction of Proposition 10.10.


Artin’s theorem (Theorem 2.6 below) will make use of the hypothesis “left
Artinian” in identifying those simple rings that are semisimple. The hypothesis
left Artinian may therefore be regarded as a useful generalization of finite
dimensionality. Before we come to that theorem, we give a construction that produces
large numbers of nontrivial examples of such rings.


EXAMPLE (triangular rings). Let R and S be nonzero rings with identity, and
let M be an (R, S) bimodule.³ Define a set A and operations of addition and
multiplication symbolically by

$$A = \begin{pmatrix} R & M \\ 0 & S \end{pmatrix} = \left\{ \begin{pmatrix} r & m \\ 0 & s \end{pmatrix} \;\middle|\; r \in R,\ m \in M,\ s \in S \right\}$$

with

$$\begin{pmatrix} r & m \\ 0 & s \end{pmatrix} \begin{pmatrix} r' & m' \\ 0 & s' \end{pmatrix} = \begin{pmatrix} rr' & rm' + ms' \\ 0 & ss' \end{pmatrix}.$$

Then A is a ring with identity, the bimodule property entering the proof of
associativity of multiplication in A. We can identify R, M, and S with the
additive subgroups of A given by $\begin{pmatrix} R & 0 \\ 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & M \\ 0 & 0 \end{pmatrix}$, and $\begin{pmatrix} 0 & 0 \\ 0 & S \end{pmatrix}$. Problems 8-11 at
the end of the chapter ask one to check the following facts:
(i) The left ideals in A are of the form $I_1 \oplus I_2$, where $I_2$ is a left ideal in S
and $I_1$ is a left R submodule of $R \oplus M$ containing $MI_2$.
(ii) The right ideals in A are of the form $J_1 \oplus J_2$, where $J_1$ is a right ideal in
R and $J_2$ is a right S submodule of $M \oplus S$ containing $J_1M$.
(iii) The ring A is left Noetherian if and only if R and S are left Noetherian
and M satisfies the ascending chain condition for its left R submodules.
The ring A is right Noetherian if and only if R and S are right Noetherian
and M satisfies the ascending chain condition for its right S submodules.
(iv) The previous item remains valid if “Noetherian” is replaced by
“Artinian” and “ascending” is replaced by “descending.”
(v) If $A = \begin{pmatrix} R & R \\ 0 & S \end{pmatrix}$ is a ring such as $\begin{pmatrix} \mathbb{Q} & \mathbb{Q} \\ 0 & \mathbb{Z} \end{pmatrix}$ in which S is a (commutative)
Noetherian integral domain with field of fractions R and if $S \neq R$, then
A is left Noetherian and not right Noetherian, and A is neither left nor
right Artinian.
(vi) If $A = \begin{pmatrix} R & R \\ 0 & S \end{pmatrix}$ is a ring such as $\begin{pmatrix} \mathbb{Q}(x) & \mathbb{Q}(x) \\ 0 & \mathbb{Q} \end{pmatrix}$ in which R and S are fields with
$S \subseteq R$ and $\dim_S R$ infinite, then A is left Noetherian and left Artinian,
and A is neither right Noetherian nor right Artinian.
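
The multiplication rule for a triangular ring is easy to experiment with. The following minimal sketch assumes the case R = M = Q and S = Z from item (v); the class name `Tri` and the sample elements are hypothetical and serve only as an illustration.

```python
from fractions import Fraction
from itertools import product

class Tri:
    """Element (r m; 0 s) of the triangular ring with R = M = Q and S = Z."""
    def __init__(self, r, m, s):
        self.r, self.m, self.s = Fraction(r), Fraction(m), int(s)
    def __mul__(self, other):
        # (r m; 0 s)(r' m'; 0 s') = (rr'  rm' + ms'; 0  ss')
        return Tri(self.r * other.r,
                   self.r * other.m + self.m * other.s,
                   self.s * other.s)
    def __add__(self, other):
        return Tri(self.r + other.r, self.m + other.m, self.s + other.s)
    def __eq__(self, other):
        return (self.r, self.m, self.s) == (other.r, other.m, other.s)
    def __repr__(self):
        return f"Tri({self.r}, {self.m}, {self.s})"

one = Tri(1, 0, 1)
samples = [Tri(2, Fraction(1, 3), 5), Tri(Fraction(-1, 2), 4, 0), Tri(0, 7, -3)]
# Spot-check associativity on all triples of the sample elements.
associative = all((a * b) * c == a * (b * c)
                  for a, b, c in product(samples, repeat=3))

# Right multiplication scales the M-slot only by integers, so the right
# ideals (0 (1/2^k)Z; 0 0) form a strictly increasing chain.
half = Tri(0, Fraction(1, 2), 0)
right_multiples = [(half * Tri(r, m, s)).m for r, m, s in [(1, 1, 3), (5, 2, -4)]]
```

The values in `right_multiples` are always integer multiples of 1/2, which is the mechanism behind the failure of the ascending chain condition for right ideals in item (v).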


From these examples we see, among other things, that “left” and “right” are
somewhat independent for both the Noetherian and the Artinian conditions. We
already know from the commutative case that Noetherian does not imply Artinian,
Z being a counterexample. We shall see in Theorem 2.15 later that left Artinian
implies left Noetherian and that right Artinian implies right Noetherian.


³This means that M is an abelian group with the structure of a unital left R module and the
structure of a unital right S module in such a way that (rm)s = r(ms) for all $r \in R$, $m \in M$, and
$s \in S$.


Theorem 2.6 (E. Artin). If R is a simple ring, then the following conditions 
are equivalent: 


(a) R is left Artinian, 

(b) R is semisimple, 

(c) R has a minimal left ideal, 

(d) $R \cong M_n(D)$ for some integer $n \ge 1$ and some division ring D.


In particular, a left Artinian simple ring is right Artinian. 


REMARK. Theorem 2.4 is a special case of the assertion that (a) implies
(d). In fact, if R is a finite-dimensional algebra over a field F, then the finite
dimensionality forces R to be left Artinian.


PROOF. It is evident from Wedderburn’s Theorem (Theorem 2.2) that (b) and 
(d) are equivalent. For the rest we prove that (a) implies (c), that (c) implies (b), 
and that (b) implies (a). 

Suppose that (a) holds. Applying the minimum condition for left ideals in R, 
we obtain a minimal left ideal. Thus (c) holds. 

Suppose that (c) holds. Let V be a minimal left ideal. Then the sum $I =
\sum_{r \in R} Vr$ is a two-sided ideal in R, and it is nonzero because the term for r = 1
is nonzero. Since R is simple, $I = R$. Then the left R module R is spanned by
the simple left R modules Vr, and R is semisimple. Thus (b) holds.

Suppose that (b) holds. Since R is semisimple, the left R module R has a 
composition series. Then the left ideals in R satisfy both chain conditions, and it 
follows that R is left Artinian. Thus (a) holds. 


4. Wedderburn-Artin Radical


In this section we introduce one notion of “radical” for certain rings with identity,
and we show how it is related to semisimplicity. This notion, the “Wedderburn-Artin
radical,” is defined under the hypothesis that the ring is left Artinian. It is
not the only notion of radical studied by ring theorists, however. There is a useful
generalization, known as the “Jacobson radical,” that is defined for arbitrary rings
with identity. We shall not define and use the Jacobson radical in this text.

Fix a ring R with identity. A nilpotent element in R is an element a with
$a^n = 0$ for some integer $n \ge 1$. A nil left ideal is a left ideal in which every
element is nilpotent; nil right ideals and nil two-sided ideals are defined similarly.




A nilpotent left ideal is a left ideal $I$ such that $I^n = 0$ for some integer $n \ge 1$,
i.e., for which $a_1 \cdots a_n = 0$ for all n-fold products of elements from $I$; nilpotent
right ideals and nilpotent two-sided ideals are defined similarly.


Lemma 2.7. If $I_1$ and $I_2$ are nilpotent left ideals in a ring R with identity, then
$I_1 + I_2$ is nilpotent.


PROOF. Let $I_1^r = 0$ and $I_2^s = 0$. Expand $(I_1 + I_2)^k$ as $\sum I_{i_1} I_{i_2} \cdots I_{i_k}$ with each
$i_j$ equal to 1 or 2. Take $k = r + s$. In any term of the sum, there are $\ge r$ indices 1
or $\ge s$ indices 2. In the first case let there be $t$ indices 2 at the right end. Since
$I_2 I_1 \subseteq I_1$, we can absorb all other indices 2, and the term of the sum is contained
in $I_1^r I_2^t = 0$. Similarly in the second case if there are $t'$ indices 1 at the right end,
then the term is contained in $I_2^s I_1^{t'} = 0$.


Lemma 2.8. If $I$ is a nilpotent left ideal in a ring R with identity, then $I$ is
contained in a nilpotent two-sided ideal J.

PROOF. Put $J = \sum_{r \in R} Ir$. This is a two-sided ideal. For any integer $k > 0$,
$J^k = \bigl(\sum_{r \in R} Ir\bigr)^k \subseteq \sum_{r_1, \dots, r_k} Ir_1 Ir_2 \cdots Ir_k \subseteq \sum_{r_k} I^k r_k$. If $I^k = 0$, then
$J^k = 0$.


Lemma 2.9. If R is a ring with identity, then the sum of all nilpotent left ideals
in R is a nil two-sided ideal.


PROOF. Let K be the sum of all nilpotent left ideals in R, and let a be a member
of K. Write $a = a_1 + \cdots + a_n$ with $a_i \in I_i$ for a nilpotent left ideal $I_i$. Lemma
2.7 shows that $I = \sum_{i=1}^n I_i$ is a nilpotent left ideal. Since a is in $I$, a is a nilpotent
element.

The set K is certainly a left ideal, and we need to see that aR is in K in order to
see that K is a two-sided ideal. Lemma 2.8 shows that $I \subseteq J$ for some nilpotent
two-sided ideal J. Then $J \subseteq K$ because J is one of the nilpotent left ideals
whose sum is K. Since a is in $I$ and therefore in J and since J is a two-sided
ideal, aR is contained in J. Therefore aR is contained in K, and K is a two-sided
ideal.


Theorem 2.10. If R is a left Artinian ring, then any nil left ideal in R is 
nilpotent. 


REMARK. Readers familiar with a little structure theory for finite-dimensional 
Lie algebras will recognize this theorem as an analog for associative algebras of 
Engel’s Theorem. 


PROOF. Let $I$ be a nil left ideal of R, and form the filtration

$$I \supseteq I^2 \supseteq I^3 \supseteq \cdots.$$

Since R is left Artinian, this filtration is constant from some point on, and we
have $I^k = I^{k+1} = I^{k+2} = \cdots$ for some $k \ge 1$. Put $J = I^k$. We shall show that
$J = 0$, and then we shall have proved that $I$ is a nilpotent ideal.

Suppose that $J \neq 0$. Since $J^2 = I^{2k} = I^k = J$, we have $J^2 = J$. Thus the
left ideal J has the property that $JJ \neq 0$. Since R is left Artinian, the set of left
ideals $K \subseteq J$ with $JK \neq 0$ has a minimal element $K_0$. Choose $a \in K_0$ with
$Ja \neq 0$. Since $Ja \subseteq JK_0 \subseteq K_0$ and $J(Ja) = J^2a = Ja \neq 0$, the minimality
of $K_0$ implies that $Ja = K_0$. Thus there exists $x \in J$ with $xa = a$. Applying
powers of x, we obtain $x^na = a$ for every integer $n \ge 1$. But x is a nilpotent
element, being in $I$, and thus we have a contradiction.


Corollary 2.11. If R is a left Artinian ring, then there exists a unique largest
nilpotent two-sided ideal $I$ in R. This ideal is the sum of all nilpotent left ideals
and also is the sum of all nilpotent right ideals.


REMARKS. The two-sided ideal $I$ of the corollary is called the Wedderburn-Artin
radical of R and will be denoted by rad R. This exists under the hypothesis
that R is left Artinian.


PROOF. By Lemma 2.9 and Theorem 2.10 the sum of all nilpotent left ideals in
R is a two-sided nilpotent ideal $I$. Lemma 2.8 shows that any nilpotent right ideal
is contained in a nilpotent two-sided ideal J. Since J is in particular a nilpotent
left ideal, the definition of $I$ forces $J \subseteq I$. Hence the sum of all nilpotent right
ideals is contained in $I$. But $I$ itself is a nilpotent right ideal and hence equals
the sum of all the nilpotent right ideals.


Lemma 2.12 (Brauer’s Lemma). If R is any ring with identity and if V is a
minimal left ideal in R, then either $V^2 = 0$ or $V = Re$ for some element e of V
with $e^2 = e$.


REMARK. An element e with the property that $e^2 = e$ is said to be idempotent.


PROOF. Being a minimal left ideal, V is a simple left R module. Schur’s
Lemma (Proposition 10.4b of Basic Algebra) shows that $\operatorname{End}_R V$ is a division
ring. If a is in V, then the map $v \mapsto va$ of V into itself lies in $\operatorname{End}_R V$ and hence
is the 0 map or is one-one onto. If it is the 0 map for all $a \in V$, then $V^2 = 0$.
Otherwise suppose that a is an element for which $v \mapsto va$ is one-one onto. Then
there exists $e \in V$ with $ea = a$. Multiplying on the left by e gives $e^2a = ea$ and
therefore $(e^2 - e)a = 0$. Since the map $v \mapsto va$ is assumed to be one-one onto,
we must have $e^2 - e = 0$ and $e^2 = e$.


Theorem 2.13. If R is a left Artinian ring and if the Wedderburn—Artin radical 
of R is 0, then R is a semisimple ring. 




REMARKS. Conversely semisimple rings are left Artinian and have radical 0. 
In fact, we already know that semisimple rings have a composition series for 
their left ideals and hence are left Artinian. To see that the radical is 0, apply 
Theorem 2.2 and write the ring as $R \cong M_{n_1}(D_1) \times \cdots \times M_{n_r}(D_r)$. The two-sided
ideals of R are the various subproducts, with 0 in the missing coordinates. Such a 
subproduct cannot be nilpotent as an ideal unless it is 0, since the identity element 
in any factor is not a nilpotent element in R. 


PROOF. Let us see that any minimal left ideal $I$ of R is a direct summand as a
left R submodule. Since rad R = 0, $I$ is not nilpotent. Thus $I^2 \neq 0$, and Lemma
2.12 shows that $I$ contains an idempotent e. This element satisfies $I = Re$. Put
$I' = \{r \in R \mid re = 0\}$. Then $I'$ is a left ideal in R. Since $I' \cap I \subseteq I$ and e is
not in $I'$, the minimality of $I$ forces $I' \cap I = 0$. Writing $r = re + (r - re)$ with
$re \in I$ and $r - re \in I'$, we see that $R = I + I'$. Therefore $R = I \oplus I'$.

Now put $I_1 = I$ and $e_1 = e$. If $I'$ is not 0, choose a minimal left ideal $I_2 \subseteq I'$ by the
minimum condition for left ideals in R. Arguing as in the previous paragraph, we
have $I_2 = Re_2$ for some element $e_2$ with $e_2^2 = e_2$. The argument in the previous
paragraph shows that $R = I_2 \oplus I_2'$, where $I_2' = \{r \in R \mid re_2 = 0\}$. Define $I'' =
\{r \in R \mid re_1 = re_2 = 0\} = I' \cap I_2'$. Since $I_2$ is contained in $I'$, we can intersect
$R = I_2 \oplus I_2'$ with $I'$ and obtain $I' = I_2 \oplus I''$. Then $R = I \oplus I' = I_1 \oplus I_2 \oplus I''$.
Continuing in this way, we obtain $R = I_1 \oplus I_2 \oplus I_3 \oplus I'''$, etc. As this construction
continues, we have $I' \supsetneq I'' \supsetneq I''' \supsetneq \cdots$. Since R is left Artinian, this sequence
must terminate, evidently in 0. Then R is exhibited as the sum of simple left R
modules and is semisimple.


Corollary 2.14. If R is a left Artinian ring, then R/ rad R is a semisimple ring. 


PROOF. Let $I = \operatorname{rad} R$, and let $\varphi : R \to R/I$ be the quotient homomorphism.
Arguing by contradiction, let $\bar{J}$ be a nonzero nilpotent left ideal in $R/I$, and let
$J = \varphi^{-1}(\bar{J}) \subseteq R$. Since $\bar{J}$ is nilpotent, $J^k \subseteq I$ for some integer $k \ge 1$. But
$I$, being the radical, is nilpotent, say with $I^l = 0$, and hence $J^{kl} \subseteq I^l = 0$.
Therefore J is a nilpotent left ideal in R strictly containing $I$, in contradiction to
the maximality of $I$. We conclude that no such $\bar{J}$ exists. Then rad(R/rad R) = 0.
Since R/rad R is left Artinian as a quotient of a left Artinian ring, Theorem 2.13
shows that R/rad R is a semisimple ring.


We shall use this corollary to prove that left Artinian rings are left Noetherian. 
We state the theorem, state and prove a lemma, and then prove the theorem. 


Theorem 2.15 (Hopkins). If R is a left Artinian ring, then R is left Noetherian. 




Lemma 2.16. If R is a semisimple ring, then every unital left R module M 
is semisimple. Consequently any unital left R module satisfying the descending 
chain condition has a composition series and therefore satisfies the ascending 
chain condition. 


PROOF. For each $m \in M$, let $R_m$ be a copy of the left R module R, and
define $\widetilde{M} = \bigoplus_{m \in M} R_m$ as a left R module. Since each $R_m$ is semisimple, $\widetilde{M}$ is
semisimple. Define a function $\varphi : \widetilde{M} \to M$ as follows: if $r_{m_1} + \cdots + r_{m_k}$ is given
with $r_{m_j}$ in $R_{m_j}$ for each j, let $\varphi(r_{m_1} + \cdots + r_{m_k}) = \sum_{j=1}^k r_{m_j}m_j$. Then $\varphi$ is an
R module map with the property that $\varphi(1_m) = m$, and consequently $\varphi$ carries $\widetilde{M}$
onto M. As the image of a semisimple R module under an R module map, M is
semisimple.

Now suppose that M is a unital left R module satisfying the descending chain
condition. We have just seen that M is semisimple, and thus we can write
$M = \bigoplus_{i \in S} M_i$ as a direct sum over a set S of simple left R modules $M_i$. Let us
see that S is a finite set. If S were not a finite set, then we could choose an infinite
sequence $i_1, i_2, \dots$ of distinct members of S, and we would obtain

$$M \supsetneq \bigoplus_{i \neq i_1} M_i \supsetneq \bigoplus_{i \neq i_1, i_2} M_i \supsetneq \cdots,$$

in contradiction to the fact that the R submodules of M satisfy the descending
chain condition.

PROOF OF THEOREM 2.15. Let $I = \operatorname{rad} R$. Since $I$ is nilpotent, $I^n = 0$ for
some n. Each $I^k$ for $k \ge 0$ is a left R submodule of R. Since R is left Artinian,
its left R submodules satisfy the descending chain condition, and the same thing
is true of the R submodules of each $I^k$. Consequently the R submodules of each
$I^k/I^{k+1}$ satisfy the descending chain condition.

In the action of R on $I^k/I^{k+1}$ on the left, $I$ acts as 0. Hence $I^k/I^{k+1}$ becomes
a left $R/I$ module, and the $R/I$ submodules of this left $R/I$ module must satisfy
the descending chain condition. Corollary 2.14 shows that $R/I = R/\operatorname{rad} R$ is
a semisimple ring. Since the $R/I$ submodules of $I^k/I^{k+1}$ satisfy the descending
chain condition, Lemma 2.16 shows that these $R/I$ submodules satisfy the
ascending chain condition. Therefore the R submodules of each left R module
$I^k/I^{k+1}$ satisfy the ascending chain condition.

We shall show inductively for $k \ge 0$ that the R submodules of $R/I^{k+1}$ satisfy
the ascending chain condition. Since $I^n = 0$, this conclusion will establish that
R is left Noetherian, as required. The case k = 0 was shown in the previous
paragraph. Assume inductively that the R submodules of $R/I^k$ satisfy the
ascending chain condition. Since $R/I^k \cong (R/I^{k+1})/(I^k/I^{k+1})$ and since the
R submodules of $R/I^k$ and of $I^k/I^{k+1}$ satisfy the ascending chain condition, the
same is true for $R/I^{k+1}$. This completes the proof.




5. Wedderburn’s Main Theorem 


Wedderburn’s Main Theorem is an analog for finite-dimensional associative 
algebras over a field of characteristic 0 of the Levi decomposition of a finite- 
dimensional Lie algebra over a field of characteristic 0. Each of these results says 
that the given algebra is a “semidirect product” of the radical and a semisimple 
subalgebra isomorphic to the quotient of the given algebra by the radical. In other 
words, the whole algebra, as a vector space, is the direct sum of the radical and a 
vector subspace that is closed under multiplication. 

An example of this phenomenon occurs with a block upper-triangular subalgebra
A of $M_n(D)$ whenever D is a finite-dimensional division algebra over the
given field. Let the diagonal blocks be of sizes $n_1, \dots, n_r$ with $n_1 + \cdots + n_r = n$.
The radical rad A is the nilpotent ideal of all matrices whose only nonzero entries
are above and to the right of the diagonal blocks, and the semisimple subalgebra
consists of all matrices whose only nonzero entries lie within the diagonal blocks.
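
The splitting in this example can be seen concretely in a small case. The sketch below is an illustration only: it assumes integer entries (so D = Q), hypothetical block sizes 2 and 1, and made-up matrices `A` and `B`; the helper names `block_of` and `split` are not from the text.

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block_of(index, sizes):
    """Return which diagonal block a row or column index belongs to."""
    start = 0
    for b, size in enumerate(sizes):
        if index < start + size:
            return b
        start += size
    raise IndexError(index)

def split(A, sizes):
    """Split a block upper-triangular A as (diagonal-block part, radical part)."""
    n = len(A)
    same = lambda i, j: block_of(i, sizes) == block_of(j, sizes)
    s_part = [[A[i][j] if same(i, j) else 0 for j in range(n)] for i in range(n)]
    rad_part = [[0 if same(i, j) else A[i][j] for j in range(n)] for i in range(n)]
    return s_part, rad_part

sizes = [2, 1]   # hypothetical block sizes n_1 = 2, n_2 = 1, so n = 3
A = [[1, 2, 3],
     [4, 5, 6],
     [0, 0, 7]]
B = [[2, 0, 1],
     [1, 1, 0],
     [0, 0, 3]]
SA, NA = split(A, sizes)   # A = SA + NA
SB, NB = split(B, sizes)   # B = SB + NB
zero = [[0] * 3 for _ in range(3)]
```

With two diagonal blocks, any product of two radical parts such as `matmul(NA, NB)` is zero, the product `matmul(SA, SB)` has no radical part, and `matmul(A, NB)` has no diagonal-block part. These are the closure properties that the vector-space decomposition A = S ⊕ rad A requires.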


Theorem 2.17 (Wedderburn’s Main Theorem). Let A be a finite-dimensional 
associative algebra with identity over a field F of characteristic 0, and let rad A be 
the Wedderburn—Artin radical. Then there exists a subalgebra S of A isomorphic 
as an F algebra to A/rad A such that $A = S \oplus \operatorname{rad} A$ as vector spaces.


REMARKS. The finite dimensionality implies that A is left Artinian, and
Corollary 2.14 shows that A/rad A is a semisimple algebra. The decomposition
$A = S \oplus \operatorname{rad} A$ is different in nature from the one in Theorem 2.2, which involves
complementary ideals. When there are complementary ideals, the identity of A
decomposes as the sum of the identities for each summand. Here the identity of
A is the identity of S and has 0 component in rad A. To see this, write $1 = a + b$
with $a \in S$ and $b \in \operatorname{rad} A$. Multiplying $1 = a + b$ on the left and right by $s \in S$,
we see that $as = s = sa$ and that $bs = sb = 0$. Hence $a = 1_S$ is the identity of
S. Then $b^2 = (1 - 1_S)^2 = 1 - 2 \cdot 1_S + 1_S^2 = 1 - 2 \cdot 1_S + 1_S = 1 - 1_S = b$, and
$b^n = b$ for all $n \ge 1$. Since rad A is nilpotent, $b^n = 0$ for some n. Thus $b = 0$,
and $1 = 1_S$ as asserted.


Theorem 2.17 is a deep result, and the proof will occupy all of the present
section and the next. The key special case to understand occurs when $A/\operatorname{rad} A \cong
M_{n_1}(F) \times \cdots \times M_{n_r}(F)$. We shall handle this case by means of Theorem 2.18
below, whose proof will be the main goal of the present section. Corollary 2.27 (of
Theorem 2.18) near the end of this section will show that Theorem 2.18 implies
this special case of Theorem 2.17 for r = 1, and Corollary 2.28 will deduce this
special case of Theorem 2.17 for general r from Corollary 2.27.




Theorem 2.18. Let A be a left Artinian ring with Wedderburn-Artin radical
rad A, and suppose that A/rad A is simple, i.e., is of the form $A/\operatorname{rad} A \cong M_n(D)$
for some division ring D. Then A is isomorphic as a ring to $M_n(R)$ for some left
Artinian ring R such that $R/\operatorname{rad} R \cong D$.


The idea behind the proof of Theorem 2.18 is to give an abstract characterization
of a ring of matrices in terms of the elements $E_{ij}$ that are 1 in the $(i, j)$th
place and are 0 elsewhere. In turn, these elements arise from the diagonal such
elements $E_{ii}$, which are idempotents, i.e., have $E_{ii}^2 = E_{ii}$. The critical issue in
the proof of Theorem 2.18 is to show that each idempotent of A/rad A, which is
assumed to be a full matrix ring $M_n(D)$, has an idempotent in its preimage in A.
The lifted idempotents then point to $M_n(R)$ for a certain R.

Thus we begin with some discussion of idempotents. We shall intersperse
facts about general rings with facts about left Artinian rings as we go along. For
the moment let R be any ring with identity, and let e be an idempotent. Then
1 − e is an idempotent, and we have the three Peirce⁴ decompositions

$$
\begin{aligned}
R &= Re \oplus R(1 - e), \\
R &= eR \oplus (1 - e)R, \\
R &= eRe \oplus eR(1 - e) \oplus (1 - e)Re \oplus (1 - e)R(1 - e).
\end{aligned}
$$

All the direct sums may be regarded as direct sums of abelian groups. The two
members of the right side in the first case are left ideals, and the two members of
the right side in the second case are right ideals. If $r \in R$ is given, then the first
decomposition is as $r = re + r(1 - e)$; the decomposition is direct because if
$r_1e = r_2(1 - e)$, then right multiplication by e gives $r_1e = 0$ since $e^2 = e$. The
second decomposition is proved similarly, and the third decomposition follows
by combining the first two. In the third decomposition, eRe is a ring with e as
identity, and (1 − e)R(1 − e) is a ring with 1 − e as identity.


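EXAMPLE. Let $R = M_n(F)$, and let

$$e = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad \text{so that} \quad 1 - e = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$

In block form we then have

$$eRe = \begin{pmatrix} * & 0 \\ 0 & 0 \end{pmatrix}, \qquad eR(1 - e) = \begin{pmatrix} 0 & * \\ 0 & 0 \end{pmatrix},$$

$$(1 - e)Re = \begin{pmatrix} 0 & 0 \\ * & 0 \end{pmatrix}, \qquad (1 - e)R(1 - e) = \begin{pmatrix} 0 & 0 \\ 0 & * \end{pmatrix}.$$


⁴Pronounced “purse.” Charles Sanders Peirce (1839-1914).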

Proposition 2.19. In a ring R with identity, let e be an element of R with
$e^2 = e$.

(a) If $I$ is a left ideal in eRe, then $eRI = I$. Hence $I \mapsto RI$ is a one-one
inclusion-preserving map of the left ideals of eRe to those of R.

(b) If J is a two-sided ideal of eRe, then $e(RJR)e = J$. Hence $J \mapsto RJR$
is a one-one inclusion-preserving map of the two-sided ideals of eRe to those of
R. This map respects multiplication of ideals.

(c) If J is a two-sided ideal of R, then eJe is a two-sided ideal of eRe, and
$eRe \cap J = eJe$.


PROOF. For (a), we have $eRI = eR(eI) = (eRe)I = I$, the first equality
holding because e is the identity in eRe and the third equality holding because
eRe contains its identity e. The rest of (a) then follows.

For (b), J satisfies $J = eJe$, since $ej = je = j$ for every $j \in eRe$, and
therefore $eRJRe = eReJeRe = (eRe)J(eRe) = J$, the last equality holding
because eRe contains its identity e. To see that $J \mapsto RJR$ respects multiplication,
we compute that $(RJR)(RJ'R) = RJRJ'R = R(Je)R(eJ')R =
RJ(eRe)J'R = RJJ'R$.

For (c), $eRe \cap J \supseteq eJe$ certainly. In the reverse direction, let j be in $eRe \cap J$.
Then $j = ere$ for some $r \in R$, and hence $eje = e^2re^2 = ere = j$ shows that j
is in eJe.


Corollary 2.20. In a left Artinian ring R, let e be an element with $e^2 = e$.
Then the ring eRe is left Artinian, and

$$\operatorname{rad}(eRe) = eRe \cap \operatorname{rad} R = e(\operatorname{rad} R)e.$$

If $\bar{R}$ denotes the quotient ring R/rad R and $\bar{e}$ denotes the element e + rad R of the
quotient, then the quotient map carries eRe onto $\bar{e}\bar{R}\bar{e}$ and has kernel rad(eRe).
Consequently

$$eRe/\operatorname{rad}(eRe) \cong \bar{e}\bar{R}\bar{e}.$$


PROOF. The ring eRe is left Artinian as an immediate consequence of Proposition
2.19a. For the first display we may assume that R and eRe are both left
Artinian. Then $eRe \cap \operatorname{rad} R$ is a two-sided ideal of eRe, and $(eRe \cap \operatorname{rad} R)^n \subseteq
(\operatorname{rad} R)^n$ for every n. Since $(\operatorname{rad} R)^N = 0$ for some N, $eRe \cap \operatorname{rad} R$ is nilpotent,
and $eRe \cap \operatorname{rad} R \subseteq \operatorname{rad}(eRe)$. Since the reverse inclusion is evident, we obtain
$\operatorname{rad}(eRe) = eRe \cap \operatorname{rad} R$. The equality $eRe \cap \operatorname{rad} R = e(\operatorname{rad} R)e$ is the special
case of Proposition 2.19c in which J = rad R. This proves the equalities in the
first display.

For the isomorphism in the second display, the quotient mapping carries ere
to $ere + \operatorname{rad} R = (e + \operatorname{rad} R)(r + \operatorname{rad} R)(e + \operatorname{rad} R) = \bar{e}(r + \operatorname{rad} R)\bar{e}$. Thus
the quotient map $R \to \bar{R}$ carries eRe onto $\bar{e}\bar{R}\bar{e}$. The kernel is $eRe \cap \operatorname{rad} R$,
which we have just proved is rad(eRe). Therefore the quotient map exhibits an
isomorphism of rings $eRe/\operatorname{rad}(eRe) \cong \bar{e}\bar{R}\bar{e}$.


Proposition 2.21. Ina ring R with identity, let e; and e2 be idempotents. Then 
the unital left R modules Re, and Re are isomorphic as left R modules if and 
only if there exist elements e;7 and ez; in R such that 


e1e€12€2 = 12, e2€21€1 = 21, 


€12€21 = @1, €21€12 = @2. 


REMARK. In this case we shall say that e; and e2 are isomorphic idempotents, 
and we shall write e; = eo. 


PROOF. Let $\varphi : Re_1 \to Re_2$ be an $R$ isomorphism. Define $e_{12} = \varphi(e_1)$
and $e_{21} = \varphi^{-1}(e_2)$. Every element $s$ of $Re_2$ has the property that $se_2 = s$
because $e_2^2 = e_2$; since $e_{12}$ lies in $Re_2$, $e_{12}e_2 = e_{12}$. Meanwhile,
$e_{12} = \varphi(e_1) = \varphi(e_1^2) = e_1\varphi(e_1) = e_1e_{12}$. Putting these two facts
together gives $e_{12} = e_{12}e_2 = e_1e_{12}e_2$. This proves the first equality in the
display, and the equality $e_{21} = e_2e_{21}e_1$ is proved similarly. Also,
$e_1 = \varphi^{-1}(\varphi(e_1)) = \varphi^{-1}(e_{12}) = \varphi^{-1}(e_{12}e_2)
= e_{12}\varphi^{-1}(e_2) = e_{12}e_{21}$, and similarly $e_{21}e_{12} = e_2$. This completes
the proof that an $R$ isomorphism $Re_1 \cong Re_2$ leads to elements $e_{12}$ and $e_{21}$
such that the four displayed identities hold.

For the converse, suppose that $e_{12}$ and $e_{21}$ exist and satisfy the four displayed
identities. Define $\varphi : Re_1 \to R$ by $\varphi(re_1) = re_{12}$. To see that this map
is well defined, suppose that $re_1 = 0$; then $re_{12} = r(e_1e_{12}e_2) = (re_1)e_{12}e_2 = 0$,
as required. Similarly we can define $\psi : Re_2 \to R$ by $\psi(re_2) = re_{21}$. Then

$$\psi\varphi(e_1) = \psi(e_{12}) = \psi(e_{12}e_2) = e_{12}\psi(e_2) = e_{12}e_{21} = e_1,$$

and similarly $\varphi\psi(e_2) = e_2$. Since $\psi\varphi$ and $\varphi\psi$ are $R$ module
homomorphisms, each is the identity on its domain.
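
As a concrete check of the four identities in Proposition 2.21, the following sketch
takes the ring of 2-by-2 integer matrices with the diagonal idempotents E11 and E22 and
the candidate elements E12 and E21. The helper names `mat_mul` and
`exhibits_isomorphism` are illustrative choices made here and do not come from the text.

```python
def mat_mul(a, b):
    """Return the product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exhibits_isomorphism(e1, e2, e12, e21):
    """Check the four displayed identities of Proposition 2.21."""
    return (mat_mul(mat_mul(e1, e12), e2) == e12
            and mat_mul(mat_mul(e2, e21), e1) == e21
            and mat_mul(e12, e21) == e1
            and mat_mul(e21, e12) == e2)


# Idempotents e1 = E11 and e2 = E22; candidates e12 = E12 and e21 = E21.
e1 = [[1, 0], [0, 0]]
e2 = [[0, 0], [0, 1]]
e12 = [[0, 1], [0, 0]]
e21 = [[0, 0], [1, 0]]

print(exhibits_isomorphism(e1, e2, e12, e21))
```

Swapping the roles of `e12` and `e21` makes the check fail, which reflects the fact that
the identities are sensitive to the order of the indices.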


Corollary 2.22. Let $R$ be a left Artinian ring. For each $r$ in $R$, let $\bar{r}$ be the
coset $r + \operatorname{rad} R$ in $R/\operatorname{rad} R$. If $e_1$ and $e_2$ are idempotents in $R$,
then $e_1$ and $e_2$ are isomorphic if and only if $\bar{e}_1$ and $\bar{e}_2$ are isomorphic.




PROOF. If $e_1$ and $e_2$ are given as isomorphic in $R$, let $e_{12}$ and $e_{21}$ be as in
Proposition 2.21, and pass to $R/\operatorname{rad} R$ by the quotient homomorphism to obtain
elements $\bar{e}_{12}$ and $\bar{e}_{21}$ that exhibit $\bar{e}_1$ and $\bar{e}_2$ as isomorphic
idempotents.

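Conversely let $\bar{e}_1$ and $\bar{e}_2$ be isomorphic idempotents in $R/\operatorname{rad} R$, and
use Proposition 2.21 to produce elements $\bar{u}_{12}$ and $\bar{u}_{21}$ in
$R/\operatorname{rad} R$ such that

$$\bar{e}_1\bar{u}_{12}\bar{e}_2 = \bar{u}_{12}, \qquad \bar{e}_2\bar{u}_{21}\bar{e}_1 = \bar{u}_{21},
\qquad \bar{u}_{12}\bar{u}_{21} = \bar{e}_1, \qquad \bar{u}_{21}\bar{u}_{12} = \bar{e}_2.$$

Let $u_{12}$ and $u_{21}$ be preimages of $\bar{u}_{12}$ and $\bar{u}_{21}$ in $R$. Possibly
replacing $u_{12}$ by $e_1u_{12}e_2$ and $u_{21}$ by $e_2u_{21}e_1$, we may assume that
$e_1u_{12}e_2 = u_{12}$ and $e_2u_{21}e_1 = u_{21}$. Our construction is such that
$u_{12}u_{21} = e_1 - z_1$ with $z_1$ in $\operatorname{rad} R$ and $e_1z_1 = z_1 = z_1e_1$.
Since $z_1$ is a nilpotent element,

$$(e_1 - z_1)(e_1 + z_1 + z_1^2 + \cdots + z_1^n) = e_1$$

as soon as $z_1^{n+1} = 0$. Thus we have
$u_{12}u_{21}(e_1 + z_1 + z_1^2 + \cdots + z_1^n) = e_1$. Define $e_{12} = u_{12}$ and
$e_{21} = u_{21}(e_1 + z_1 + z_1^2 + \cdots + z_1^n)$. Then it is immediate that
$\bar{e}_{12} = \bar{u}_{12}$, $\bar{e}_{21} = \bar{u}_{21}$, and $e_{12}e_{21} = e_1$. Also, the
equality $e_1u_{12}e_2 = u_{12}$ implies that $e_1e_{12}e_2 = e_{12}$, and the equality
$e_2u_{21}e_1(e_1 + z_1 + z_1^2 + \cdots + z_1^n) = u_{21}(e_1 + z_1 + z_1^2 + \cdots + z_1^n)$
implies that $e_2e_{21}e_1 = e_{21}$ since $e_1z_1 = z_1 = z_1e_1$.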

In view of Proposition 2.21, we are left with checking the value of $e_{21}e_{12}$. We
know that $\bar{e}_{21}\bar{e}_{12} = \bar{u}_{21}\bar{u}_{12} = \bar{e}_2$, and hence
$e_{21}e_{12} = e_2 - z_2$ for some $z_2$ in $\operatorname{rad} R$. Multiplying by $e_2$ on both
sides, we see that

$$e_2z_2 = z_2 = z_2e_2. \qquad (*)$$

Now $(e_{21}e_{12})(e_{21}e_{12}) = e_{21}e_1e_{12} = e_{21}e_{12}$, and thus
$(e_2 - z_2)^2 = e_2 - z_2$. Expanding out this equality and using $(*)$ gives
$e_2 - 2z_2 + z_2^2 = e_2 - z_2$ and therefore gives $z_2^2 = z_2$. Hence $z_2^n = z_2$ for
every $n \geq 1$. But $z_2$ is in $\operatorname{rad} R$, and every element of $\operatorname{rad} R$
is nilpotent. Thus $z_2 = 0$, and $e_{21}e_{12} = e_2$ as required.


The proof of Corollary 2.22 shows a little more than the statement asserts, 
and we shall use this little extra conclusion when we finally get to the proof of 
Theorem 2.18. The extra fact is that any elements $\bar{u}_{12}$ and $\bar{u}_{21}$ exhibiting
$\bar{e}_1$ and $\bar{e}_2$ as isomorphic have lifts to elements $e_{12}$ and $e_{21}$
exhibiting $e_1$ and $e_2$ as isomorphic.

The critical step of lifting a single idempotent from A/rad A to A is accom- 
plished by the following proposition. 


Proposition 2.23. Let $R$ be a left Artinian ring. For each $r$ in $R$, let $\bar{r}$ be the
element $r + \operatorname{rad} R$ of $R/\operatorname{rad} R$. If $a$ is an element of $R$ such that
$\bar{a}$ is idempotent in $R/\operatorname{rad} R$, then there exists an idempotent $e$ in $R$
such that $\bar{e} = \bar{a}$.




PROOF. Set $b = 1 - a$. The elements $a$ and $b$ commute, and $ab = a(1 - a)$ maps to
$\bar{a} - \bar{a}^2 = 0$ in $R/\operatorname{rad} R$, since $\bar{a}$ is idempotent. Therefore $ab$
lies in $\operatorname{rad} R$ and must satisfy $(ab)^n = 0$ for some $n$. Since $a$ and $b$
commute, we can apply the Binomial Theorem to obtain

$$1 = (a + b)^{2n} = \sum_{k=0}^{2n} \binom{2n}{k} a^{2n-k} b^k.$$

Define

$$e = \sum_{k=0}^{n} \binom{2n}{k} a^{2n-k} b^k \qquad\text{and}\qquad
f = \sum_{k=n+1}^{2n} \binom{2n}{k} a^{2n-k} b^k.$$

Each term of $e$ contains at least the $n$-th power of $a$, and each term of $f$ contains
at least the $n$-th power of $b$. Thus each term of $ef$ contains at least a factor
$a^nb^n = (ab)^n = 0$, and we see that $ef = 0$. Therefore
$e = e1 = e(e + f) = e^2 + 0 = e^2$, and $e$ is an idempotent. Each term of $e$ except
the one for $k = 0$ contains a factor $ab$, and thus $e \equiv a^{2n} \bmod \operatorname{rad} R$.
Since $\bar{a}$ is idempotent, $a^{2n} \equiv a \bmod \operatorname{rad} R$, and therefore
$\bar{e} = \bar{a}$.
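
The construction in this proof can be tried out in the commutative Artinian ring
Z/36Z, whose radical is generated by 6, so that the quotient by the radical is Z/6Z.
The following is only a sketch under that assumption; the function name
`lift_idempotent` and the safety bound `max_n` are hypothetical choices made for the
illustration.

```python
from math import comb


def lift_idempotent(a, modulus, max_n=64):
    """Lift a to an idempotent of Z/modulus by the binomial construction.

    Assumes that a*(1 - a) is nilpotent in Z/modulus, i.e., that a is
    idempotent modulo the radical. Returns
    e = sum_{k=0}^{n} C(2n, k) a^(2n-k) b^k with b = 1 - a.
    """
    b = (1 - a) % modulus
    ab = (a * b) % modulus
    # Find the least n >= 1 with (ab)^n = 0 in Z/modulus.
    n, power = 1, ab
    while power != 0:
        n += 1
        power = (power * ab) % modulus
        if n > max_n:
            raise ValueError("a*(1 - a) does not appear to be nilpotent")
    return sum(comb(2 * n, k) * pow(a, 2 * n - k, modulus) * pow(b, k, modulus)
               for k in range(n + 1)) % modulus


e = lift_idempotent(3, 36)   # 3 is idempotent modulo 6 but not modulo 36
print(e, (e * e) % 36, e % 6)
```

Here 3 lifts to 9, and 9 is a genuine idempotent of Z/36Z that is congruent to 3 modulo 6.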


For the proof of Theorem 2.18, we need to lift an entire matrix ring to obtain a
matrix ring, and this involves lifting more than a single idempotent. In effect, we
have to lift compatibly an entire system $e_{ij}$ that behaves like the usual system of
$E_{ij}$ for matrices. The idea is that if $R/\operatorname{rad} R$ is a matrix ring $M_n(K)$ with
some ring of coefficients $K$, then the $i$-th and $j$-th columns of $M_n(K)$ may be
described compatibly as $M_n(K)\bar{e}_{ii}$ and $M_n(K)\bar{e}_{jj}$. Proposition 2.23 allows
us to lift $\bar{e}_{ii}$ and $\bar{e}_{jj}$ to idempotents $e_{ii}$ and $e_{jj}$, and Corollary
2.22 shows that an isomorphism $\bar{e}_{ii} \cong \bar{e}_{jj}$ implies an isomorphism
$e_{ii} \cong e_{jj}$. The isomorphism gives us elements $e_{ij}$ and $e_{ji}$, and then we can
piece these together to form matrices.

Two idempotents $e$ and $f$ in a ring $R$ with identity are said to be orthogonal
if $ef = 0 = fe$. Suppose that $e_1, \ldots, e_n$ are mutually orthogonal idempotents
such that $\sum_{i=1}^n e_i = 1$. Let us see in this case that

$$R = Re_1 \oplus \cdots \oplus Re_n$$

as left $R$ modules. In fact, the condition $\sum_{i=1}^n e_i = 1$ shows that
$r = \sum_{i=1}^n re_i$ for each $r \in R$, and thus $R = Re_1 + \cdots + Re_n$. If $r$ lies in
$Re_j \cap \sum_{i \neq j} Re_i$, then $r = se_j$ and $r = \sum_{i \neq j} r_ie_i$. Multiplying
the first of these equalities on the right by $e_j$ gives $re_j = se_j^2 = se_j = r$. Hence
the second of these equalities, upon multiplication by $e_j$, yields
$r = re_j = \sum_{i \neq j} r_ie_ie_j = 0$. In other words, the sum is direct, as asserted.
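
A small numerical illustration of this decomposition, offered only as a sketch: in
Z/36Z the elements 9 and 28 are orthogonal idempotents with sum 1. The helper
`decompose` is a name chosen here for illustration.

```python
MODULUS = 36
IDEMPOTENTS = (9, 28)   # orthogonal idempotents of Z/36Z with 9 + 28 = 1 mod 36


def decompose(r, idempotents=IDEMPOTENTS, modulus=MODULUS):
    """Return the components r*e_i of r in the summands R e_i."""
    return [(r * e) % modulus for e in idempotents]


# The two summands R*9 and R*28 as sets of residues.
summand_9 = {(r * 9) % MODULUS for r in range(MODULUS)}
summand_28 = {(r * 28) % MODULUS for r in range(MODULUS)}

print(decompose(5))
print(sorted(summand_9 & summand_28))
```

The components of each residue add back up to the residue, and the two summands meet
only in 0, which is what directness of the sum requires.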


Corollary 2.24. Let $R$ be a left Artinian ring. For each $r$ in $R$, let $\bar{r}$ be the
coset $r + \operatorname{rad} R$ in $R/\operatorname{rad} R$. If $x$ and $y$ are orthogonal idempotents
in $\bar{R} = R/\operatorname{rad} R$ and if $e$ is an idempotent in $R$ with $\bar{e} = x$, then
there exists an idempotent $f$ in $R$ with $\bar{f} = y$ and $ef = fe = 0$.




PROOF. By Proposition 2.23 choose an idempotent $f_0$ in $R$ with $\bar{f}_0 = y$. Then
$f_0e$ has $\overline{f_0e} = yx = 0$. Hence $f_0e$ is in $\operatorname{rad} R$, and
$(f_0e)^{n+1} = 0$ for some $n$. Consequently $1 + f_0e + (f_0e)^2 + \cdots + (f_0e)^n$ is a
two-sided inverse to $1 - f_0e$. Define

$$f = (1 - e)\bigl(1 + f_0e + (f_0e)^2 + \cdots + (f_0e)^n\bigr) f_0 (1 - f_0e).$$

Then $\bar{f} = (1 - x)(1 + 0 + \cdots + 0)y(1 - 0) = (1 - x)y = y - xy = y$. Moreover,

$$fe = (1 - e)\bigl(1 + f_0e + (f_0e)^2 + \cdots + (f_0e)^n\bigr)(f_0e - f_0^2e^2) = 0$$

since $f_0e - f_0^2e^2 = f_0e - f_0e = 0$, and

$$ef = e(1 - e)\bigl(1 + f_0e + (f_0e)^2 + \cdots + (f_0e)^n\bigr) f_0 (1 - f_0e) = 0$$

since $e(1 - e) = 0$.
We still need to see that $f^2 = f$. Since $f_0(1 - f_0e) = f_0(1 - e)$, we can write
$f = (1 - e)(1 + f_0e + \cdots)f_0(1 - e)$ and

$$\begin{aligned}
f^2 &= (1 - e)(1 + f_0e + \cdots)f_0(1 - e)(1 + f_0e + \cdots)f_0(1 - e) \\
    &= (1 - e)(1 + f_0e + \cdots)f_0(1 - f_0e)(1 + f_0e + \cdots)f_0(1 - e) \\
    &= (1 - e)(1 + f_0e + \cdots)f_0 \cdot 1 \cdot f_0(1 - e) \\
    &= (1 - e)(1 + f_0e + \cdots)f_0(1 - f_0e) = f,
\end{aligned}$$

as required.
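
The formula for $f$ can be evaluated in a small noncommutative example. The sketch below
works in the ring of upper triangular 2-by-2 rational matrices (integer entries suffice
here), whose radical consists of the strictly upper triangular matrices. The matrices
`e` and `f0` and all helper names are hypothetical choices for the illustration.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def is_zero(a):
    return all(x == 0 for row in a for x in row)


def orthogonal_lift(e, f0):
    """Return f = (1 - e)(1 + f0 e + ... + (f0 e)^n) f0 (1 - f0 e).

    Assumes that f0*e is nilpotent, as it is when f0*e lies in the radical.
    """
    n = len(e)
    one = identity(n)
    f0e = mat_mul(f0, e)
    series, power = one, f0e
    for _ in range(n):            # a nilpotent n-by-n matrix has n-th power 0
        if is_zero(power):
            break
        series = mat_add(series, power)
        power = mat_mul(power, f0e)
    f = mat_mul(mat_mul(mat_sub(one, e), series), f0)
    return mat_mul(f, mat_sub(one, f0e))


e = [[0, 3], [0, 1]]     # idempotent lifting E22 modulo the radical
f0 = [[1, 2], [0, 0]]    # idempotent lifting E11, but f0*e is nonzero
f = orthogonal_lift(e, f0)
print(f, mat_mul(e, f), mat_mul(f, e))
```

In this instance `f0` is not orthogonal to `e`, while the corrected element `f` is an
idempotent with the same image as `f0` modulo the radical and with `ef = fe = 0`.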


Corollary 2.25. Let $R$ be a left Artinian ring. For each $r$ in $R$, let $\bar{r}$ be the
coset $r + \operatorname{rad} R$ in $R/\operatorname{rad} R$. If $\{x_1, \ldots, x_N\}$ is a finite set of
mutually orthogonal idempotents in $\bar{R} = R/\operatorname{rad} R$, then there exists a set of
mutually orthogonal idempotents $\{e_1, \ldots, e_N\}$ in $R$ such that $\bar{e}_i = x_i$ for
all $i$. If $\sum_{i=1}^N x_i = 1$, then $\sum_{i=1}^N e_i = 1$.

PROOF. For the existence of $\{e_1, \ldots, e_N\}$, we proceed by induction on $N$, the
case $N = 1$ being Proposition 2.23. Suppose we have found $e_1, \ldots, e_n$ and we
want to find $e_{n+1}$. Let $e$ be the idempotent $e_1 + \cdots + e_n$, and apply Corollary
2.24 to the idempotent $e$ in $R$ and the idempotent $x_{n+1}$ in $R/\operatorname{rad} R$; the
corollary applies because $\bar{e} = x_1 + \cdots + x_n$ is an idempotent orthogonal to
$x_{n+1}$. The corollary gives us $e_{n+1}$ orthogonal to $e$ with $\bar{e}_{n+1} = x_{n+1}$.
Since $e_i = e_ie = ee_i$ for $i \leq n$, we obtain
$e_{n+1}e_i = e_{n+1}(ee_i) = (e_{n+1}e)e_i = 0$ and similarly $e_ie_{n+1} = 0$ for
those $i$'s, and the induction is complete.

Finally $\sum_i x_i = 1$ implies that $\sum_i e_i = 1 + r$ for some $r$ in $\operatorname{rad} R$.
Then the idempotent $1 - \sum_i e_i$ is exhibited as in $\operatorname{rad} R$ and must be 0
because every element of $\operatorname{rad} R$ is nilpotent.




In a nonzero ring $R$ with identity, a finite subset $\{e_{ij} \mid i, j \in \{1, \ldots, n\}\}$
is called a set of matrix units in $R$ if $\sum_{i=1}^n e_{ii} = 1$ and
$e_{ij}e_{kl} = \delta_{jk}e_{il}$ for all $i, j, k, l$. It follows from these conditions that
the $e_{ii}$ are mutually orthogonal idempotents with sum 1, since
$e_{ii}e_{jj} = \delta_{ij}e_{ij} = \delta_{ij}e_{ii}$. In view of the remarks before Corollary
2.24, we automatically have $R = \bigoplus_{i=1}^n Re_{ii}$. In addition, the product rule
gives $e_{ii}e_{ij}e_{jj} = e_{ij}$, $e_{jj}e_{ji}e_{ii} = e_{ji}$, $e_{ij}e_{ji} = e_{ii}$, and
$e_{ji}e_{ij} = e_{jj}$; by Proposition 2.21 the idempotents $e_{ii}$ and $e_{jj}$ are
isomorphic in the sense that there is a left $R$ module isomorphism $Re_{ii} \cong Re_{jj}$.

If $A = M_n(R)$, define $E_{ij}$ to be the matrix that is 1 in the $(i, j)$-th place and
is 0 elsewhere. Then it is immediate that $\{E_{ij}\}$ is a set of matrix units in $A$. To
recognize matrix rings, we prove the following converse.


Proposition 2.26. For a nonzero ring $A$ with identity, suppose that

$$\{e_{ij} \mid i, j \in \{1, \ldots, n\}\}$$

is a set of matrix units in $A$. Let $R$ be the subring of $A$ of all elements of $A$
commuting with all $e_{ij}$. Then every element of $A$ can be written in one and only
one way as $\sum_{i,j} r_{ij}e_{ij}$ with $r_{ij} \in R$ for all $i$ and $j$, and the map
$A \to M_n(R)$ given by $a \mapsto [r_{ij}]$ is a ring isomorphism. The ring $R$ can be
recovered from $A$ by means of the isomorphism $R \cong e_{11}Ae_{11}$.


PROOF. To each $a \in A$, associate the matrix $[r_{ij}]$ in $M_n(A)$ whose entries are
given by $r_{ij} = \sum_k e_{ki}ae_{jk}$. Then

$$r_{ij}e_{lm} = \sum_k e_{ki}ae_{jk}e_{lm} = \sum_k e_{ki}a\delta_{kl}e_{jm} = e_{li}ae_{jm} \qquad (*)$$

and

$$e_{lm}r_{ij} = \sum_k e_{lm}e_{ki}ae_{jk} = \sum_k \delta_{mk}e_{li}ae_{jk} = e_{li}ae_{jm}.$$

Thus $r_{ij}e_{lm} = e_{li}ae_{jm} = e_{lm}r_{ij}$. Because of the definition of $R$, this
equality shows that $r_{ij}$ is in $R$. In particular, $[r_{ij}]$ is in $M_n(R)$. A special
case of $(*)$ is that $r_{ij}e_{ij} = e_{ii}ae_{jj}$. Hence

$$\sum_{i,j} r_{ij}e_{ij} = \sum_{i,j} e_{ii}ae_{jj} = 1a1 = a.$$

This proves that $a$ can be expanded as $a = \sum_{i,j} r_{ij}e_{ij}$.

For uniqueness, suppose that $a = \sum_{i,j} s_{ij}e_{ij}$ is given with each $s_{ij}$ in $R$.
Multiplication on the left by $e_{kp}$ and on the right by $e_{qk}$, followed by summation
over $k$, gives

$$r_{pq} = \sum_k e_{kp}ae_{qk} = \sum_k e_{kp}\Bigl(\sum_{i,j} s_{ij}e_{ij}\Bigr)e_{qk}
= \sum_{i,j,k} s_{ij}e_{kp}e_{ij}e_{qk} = \sum_k s_{pq}e_{kk} = s_{pq}.$$

This proves that the map $A \to M_n(R)$ is one-one onto.

To see that the map $A \to M_n(R)$ respects multiplication, let $a$ and $a'$ be in
$A$, and let the effect of the map on $a$, $a'$, and $aa'$ be $a \mapsto [r_{ij}]$,
$a' \mapsto [r'_{ij}]$, and $aa' \mapsto [s_{ij}]$. Then we have

$$\sum_l r_{il}r'_{lj} = \sum_{l,k,k'} e_{ki}ae_{lk}e_{k'l}a'e_{jk'}
= \sum_{l,k} e_{ki}ae_{ll}a'e_{jk} = \sum_k e_{ki}aa'e_{jk} = s_{ij},$$

and the matrix product of the images of $a$ and $a'$ coincides with the image of $aa'$.

Finally consider the image $[r_{ij}]$ of the element $a = e_{11}$ of $A$. It has
$r_{ij} = \sum_k e_{ki}e_{11}e_{jk} = \delta_{i1}\delta_{1j}\sum_k e_{kk} = \delta_{i1}\delta_{1j}$,
so that the image is $E_{11}$. If $a$ is a general element of $A$ and its image is $[r_{ij}]$,
then the result of the previous paragraph shows that $e_{11}ae_{11}$ maps to
$E_{11}[r_{ij}]E_{11} = r_{11}E_{11}$. Hence the map $e_{11}ae_{11} \mapsto r_{11}$ is an
isomorphism of $e_{11}Ae_{11}$ with $R$.
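
The coordinate formula $r_{ij} = \sum_k e_{ki}ae_{jk}$ can be tested on a nonstandard set of
matrix units. In the sketch below the units are the conjugates $PE_{ij}P^{-1}$ inside the
ring of 2-by-2 integer matrices, for a unimodular matrix $P$ chosen arbitrarily for the
illustration; in this situation the commuting subring $R$ consists of the scalar
matrices. All names in the code are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


N = 2
P = [[1, 1], [0, 1]]        # assumed change of basis (determinant 1)
P_INV = [[1, -1], [0, 1]]   # its inverse


def standard_unit(i, j):
    """The standard matrix unit E_ij (0-based indices)."""
    return [[int((r, c) == (i, j)) for c in range(N)] for r in range(N)]


# Conjugated matrix units e[i][j] = P E_ij P^{-1}.
e = [[mat_mul(mat_mul(P, standard_unit(i, j)), P_INV) for j in range(N)]
     for i in range(N)]


def coordinate(a, i, j):
    """Return r_ij = sum_k e_ki a e_jk, an element commuting with all e_lm."""
    total = [[0] * N for _ in range(N)]
    for k in range(N):
        total = mat_add(total, mat_mul(mat_mul(e[k][i], a), e[j][k]))
    return total


def reconstruct(a):
    """Return sum_{i,j} r_ij e_ij, which should equal a."""
    total = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            total = mat_add(total, mat_mul(coordinate(a, i, j), e[i][j]))
    return total


a = [[2, 7], [-1, 5]]
print(coordinate(a, 0, 0), reconstruct(a) == a)
```

Each coordinate comes out as a scalar matrix, in line with the description of $R$ as the
set of elements commuting with every $e_{ij}$.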


PROOF OF THEOREM 2.18. Let $\{x_{ij} \mid i, j \in \{1, \ldots, n\}\}$ be a set of matrix
units for the matrix ring $A/\operatorname{rad} A \cong M_n(D)$. Then $x_{11}, \ldots, x_{nn}$ are
mutually orthogonal idempotents in $A/\operatorname{rad} A$ with sum 1. By Corollary 2.25 we can
choose mutually orthogonal idempotents $e_{11}, \ldots, e_{nn}$ in $A$ with
$\sum_{i=1}^n e_{ii} = 1$ and with $\bar{e}_{ii} = x_{ii}$.

We observed at the time of defining matrix units that $x_{11}, \ldots, x_{nn}$ are
isomorphic as idempotents. Corollary 2.22 shows as a consequence that
$e_{11}, \ldots, e_{nn}$ are isomorphic as idempotents. The remarks following Corollary
2.22 show that the isomorphism of $Ae_{11}$ with $Ae_{ii}$ can be exhibited by elements
$e_{1i}$ and $e_{i1}$ in $A$ satisfying the usual properties

$$e_{11}e_{1i}e_{ii} = e_{1i}, \qquad e_{ii}e_{i1}e_{11} = e_{i1}, \qquad
e_{1i}e_{i1} = e_{11}, \qquad e_{i1}e_{1i} = e_{ii}$$

and also the properties $\bar{e}_{1i} = x_{1i}$ and $\bar{e}_{i1} = x_{i1}$. Here $\bar{a}$ is
shorthand for $a + \operatorname{rad} A$. Define $e_{ij} = e_{i1}e_{1j}$. Then
$\bar{e}_{ij} = \bar{e}_{i1}\bar{e}_{1j} = x_{i1}x_{1j} = x_{ij}$, and we readily check that
$\{e_{ij}\}$ is a set of matrix units for $A$.

By Proposition 2.26, $A \cong M_n(R)$ with $R \cong e_{11}Ae_{11}$. From Corollary 2.20
we know that $e_{11}Ae_{11}/\operatorname{rad}(e_{11}Ae_{11}) \cong \bar{e}_{11}(A/\operatorname{rad} A)\bar{e}_{11}$,
where $\bar{e}_{11}$ denotes the element $e_{11} + \operatorname{rad} A$ of $A/\operatorname{rad} A$. Hence

$$R/\operatorname{rad} R \cong \bar{e}_{11}(A/\operatorname{rad} A)\bar{e}_{11} \cong E_{11}M_n(D)E_{11} \cong D,$$

and the proof is complete.


Corollary 2.27. If $A$ is a finite-dimensional algebra with identity over a field
$F$ and if $A/\operatorname{rad} A \cong M_n(F)$ as algebras, then there is a subalgebra $S$
isomorphic to $M_n(F)$ such that $A = S \oplus \operatorname{rad} A$ as vector spaces.




REMARKS. This corollary shows that Theorem 2.18 implies Theorem 2.17
under the additional assumption that the algebra $A$ of Theorem 2.17 satisfies
$A/\operatorname{rad} A \cong M_n(F)$. It is not necessary to assume characteristic 0.


PROOF. Suppose that $A$ is a finite-dimensional algebra with identity over
$F$ such that $A/\operatorname{rad} A \cong M_n(F)$. Then $A$ is left Artinian, and Theorem 2.18
produces a certain ring $R$ with $A \cong M_n(R)$. Here Proposition 2.26 shows
that $R$ is isomorphic as a ring to $e_{11}Ae_{11}$ for a certain idempotent $e_{11}$ in $A$.
It follows that $R$ is an algebra with identity over $F$, necessarily finite-dimensional
because $A$ is finite-dimensional. The algebra $R$, according to Theorem 2.18, has
$R/\operatorname{rad} R \cong F$. Therefore $R = F \oplus \operatorname{rad} R$ as $F$ vector spaces. If we
allow $M_n(\,\cdot\,)$ to be defined even for rings without identity, then we have $F$
algebra isomorphisms

$$A \cong M_n(R) = M_n(F \oplus \operatorname{rad} R) = M_n(F) \oplus M_n(\operatorname{rad} R)$$

in which the direct sums are understood to be direct sums of vector spaces. We
shall show that

$$\operatorname{rad}(M_n(R)) = M_n(\operatorname{rad} R), \qquad (*)$$

and then the decomposition $A = S \oplus \operatorname{rad} A$ will have been proved with
$S = M_n(F)$.

To prove $(*)$, let $E_{ij}$ be the member of $M_n(R)$ that is 1 in the $(i, j)$-th place
and is 0 elsewhere. Suppose that $J$ is a two-sided ideal in $M_n(R)$. Let $I \subseteq R$
be the set of all elements $x_{11}$ for $x \in J$. If $r$ is in $R$, then $rE_{11}$ is a member
of $M_n(R)$, and the $(1, 1)$-th entry of the element $(rE_{11})x$ of $J$ is $rx_{11}$. Thus
$rx_{11}$ is in $I$. Similarly $x_{11}r$ is in $I$, and $I$ is a two-sided ideal in $R$. Let us
see that

$$J = M_n(I). \qquad (**)$$

If $x$ is in $J$, then so is $E_{i1}xE_{1j} = x_{11}E_{ij}$, and hence $IE_{ij}$ is contained in
$J$; taking sums over $i$ and $j$ shows that $M_n(I) \subseteq J$. In the reverse direction if
$x$ is in $J$, then so is $E_{1i}xE_{j1} = x_{ij}E_{11}$, and hence $x_{ij}$ is in $I$; therefore
$J \subseteq M_n(I)$. This proves $(**)$. Let us apply $(**)$ with $J = \operatorname{rad}(M_n(R))$.
The corresponding ideal $I$ of $R$ consists of all entries $x_{11}$ of members $x$ of $J$.
Using Corollary 2.20, we obtain

$$IE_{11} = E_{11}JE_{11} = E_{11}\operatorname{rad}(M_n(R))E_{11}
= \operatorname{rad}(E_{11}M_n(R)E_{11}) = \operatorname{rad}(RE_{11}).$$

Thus $I = \operatorname{rad} R$. Taking $M_n(\,\cdot\,)$ of both sides and applying $(**)$, we arrive
at $(*)$. This completes the proof.


Corollary 2.28. If $A$ is a finite-dimensional associative algebra with identity
over a field $F$ and if $A/\operatorname{rad} A \cong M_{n_1}(F) \times \cdots \times M_{n_r}(F)$, then
there is a subalgebra $S$ of $A$ isomorphic as an algebra to $A/\operatorname{rad} A$ such that
$A = S \oplus \operatorname{rad} A$ as vector spaces.




REMARKS. This corollary gives the conclusion of Theorem 2.17 under the
additional assumption that the semisimple algebra $A/\operatorname{rad} A$ over $F$ is of the form
$A/\operatorname{rad} A \cong M_{n_1}(F) \times \cdots \times M_{n_r}(F)$. If $F$ is algebraically closed,
then the division rings $D_j$ in Theorem 2.2 are finite-dimensional division algebras over
$F$ and necessarily equal $F$, as was observed in the discussion after Corollary
2.3. Thus Theorem 2.2 shows that the additional assumption about the form of
$A/\operatorname{rad} A$ is automatically satisfied if $F$ is algebraically closed. In other words,
Corollary 2.28 completes the proof of Theorem 2.17 if $F$ is algebraically closed.


PROOF. For $1 \leq j \leq r$, let $x_j$ be the identity matrix of $M_{n_j}(F)$ when
$M_{n_j}(F)$ is regarded as a subalgebra of $A/\operatorname{rad} A$. The elements $x_j$ are
orthogonal idempotents in $A/\operatorname{rad} A$ with sum 1, and Corollary 2.25 shows that they
lift to orthogonal idempotents $e_j$ of $A$ with sum 1. For each $j$, Corollary 2.20 shows
that $e_jAe_j/\operatorname{rad}(e_jAe_j) \cong x_j(A/\operatorname{rad} A)x_j \cong M_{n_j}(F)$. By Corollary
2.27, $e_jAe_j$ has a subalgebra $S_j \cong M_{n_j}(F)$ with
$e_jAe_j = S_j \oplus \operatorname{rad}(e_jAe_j)$ as vector spaces. Put $S = \bigoplus_{j=1}^r S_j$,
the direct sum being understood in the sense of vector spaces. The subalgebra $S_j$ has
identity $e_j$, and the product of $e_j$ with any other $S_i$ is 0 because
$e_ie_j = e_je_i = 0$ when $i \neq j$. If $s = \sum_i s_i$ and $s' = \sum_j s'_j$ are two
elements of $S$, then

$$ss' = \Bigl(\sum_i s_ie_i\Bigr)\Bigl(\sum_j e_js'_j\Bigr) = \sum_{i,j} s_ie_ie_js'_j
= \sum_i s_ie_is'_i = \sum_i s_is'_i.$$

Hence $S$ is a subalgebra. The element $\sum_{j=1}^r e_j$ is a two-sided identity in $S$.

Let us prove that $S \cap \operatorname{rad} A = 0$. If $s = \sum_i s_i$ is in
$S \cap \operatorname{rad} A$, then $s_j = e_jse_j$ is in $S_j = e_jSe_j$ and is in
$e_j(\operatorname{rad} A)e_j$, which equals $\operatorname{rad}(e_jAe_j)$ by Corollary 2.20. Since
$S_j \cap \operatorname{rad}(e_jAe_j) = 0$ by construction, $s_j = 0$. Thus $s = \sum_j s_j = 0$.

Consequently $S \cap \operatorname{rad} A = 0$. A count of dimensions gives
$\dim S = \sum_j \dim S_j = \sum_j n_j^2 = \dim(A/\operatorname{rad} A)$. Thus
$\dim A = \dim S + \dim(\operatorname{rad} A)$, and we conclude that $A = S \oplus \operatorname{rad} A$
as vector spaces.


6. Semisimplicity and Tensor Products 


In this section we shall complete the proof of Wedderburn’s Main Theorem 
(Theorem 2.17). In the previous section we proved in Corollary 2.28 the special 
case in which A/ rad A is isomorphic to a product of full matrix rings over the 
base field F. This special case includes all cases of Theorem 2.17 in which F is 
algebraically closed. 

The idea for the general case is to make a change of rings by tensoring $A$ with
the algebraic closure of the underlying field $F$, or at least with a large enough
finite extension $K$ of $F$ for Corollary 2.28 to be applicable. That is, we first
consider $A_K = A \otimes_F K$ and $(A/\operatorname{rad} A) \otimes_F K$ in place of $A$ and
$A/\operatorname{rad} A$. Inside $A_K$ we can recognize $(\operatorname{rad} A) \otimes_F K$ as a subalgebra
defined over $K$, and we expect that it is $\operatorname{rad} A_K$ and that we can find a
complementary subalgebra $S$ over $K$; then the question is one of showing that $S$ is of
the form $S_0 \otimes_F K$ for some semisimple subalgebra $S_0$ of $A$ defined over $F$. The
trouble with this style of argument is that the tensor product
$(A/\operatorname{rad} A) \otimes_F K$ need not be semisimple and there need not be a candidate
for $S$. Some question about separability of field extensions plays a role, as the
following example shows, and the assumption of characteristic 0 will ensure this
separability.


EXAMPLE. We exhibit two extension fields $K$ and $L$ of a base field $F$ such that
$K \otimes_F L$ is not a semisimple algebra over $F$. The field extensions are each 1-by-1
matrix algebras over an extension field of $F$ and hence are simple algebras, yet
the tensor product is not semisimple. Fix a prime field $\mathbb{F}_p$, and let
$F = \mathbb{F}_p(x^p)$ be a simple transcendental extension of $\mathbb{F}_p$. Define
$K = L = \mathbb{F}_p(x) = F(\sqrt[p]{x^p})$. Both $K$ and $L$ are field extensions of $F$ of
degree $p$. Thus $K \otimes_F L$ is a finite-dimensional commutative algebra with identity
over $F$, by the construction in Proposition 10.24 of Basic Algebra. The element
$z = x \otimes 1 - 1 \otimes x$ in $K \otimes_F L$ is nonzero but has
$z^p = x^p \otimes 1 - 1 \otimes x^p = x^p \otimes 1 - x^p \otimes 1 = 0$, the next-to-last
equality following because $x^p$ lies in the base field $F$. Consequently $K \otimes_F L$ has
a nonzero nilpotent element. If $K \otimes_F L$ were semisimple, Theorem 2.2 would
show that it was the direct product of fields, and it could not have any nonzero
nilpotent elements. We conclude that $K \otimes_F L$ is not a semisimple algebra.
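
The computation of $z^p$ rests on the identity $(u - v)^p = u^p - v^p$ for commuting
elements $u$ and $v$ in characteristic $p$, applied to $u = x \otimes 1$ and
$v = 1 \otimes x$. That identity in turn comes from the divisibility of the middle
binomial coefficients by $p$, which the following sketch checks for a few small primes.
The function name `frobenius_coefficients` is a hypothetical choice.

```python
from math import comb


def frobenius_coefficients(p):
    """Coefficients mod p of (u - v)^p in the monomials u^k v^(p-k).

    Entry k of the returned list is the coefficient of u^k v^(p-k),
    reduced modulo p.
    """
    return [(comb(p, k) * (-1) ** (p - k)) % p for k in range(p + 1)]


for p in (2, 3, 5, 7):
    print(p, frobenius_coefficients(p))
```

For each prime only the two end coefficients survive, and the coefficient of $v^p$ is
congruent to $-1$ modulo $p$ (which is the same as $+1$ when $p = 2$).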


Proposition 2.29. Let $F$ be a field, let $K = F(\alpha)$ be a simple algebraic
extension, let $g(X)$ be the minimal polynomial of $\alpha$ over $F$, and let $L$ be another
field extension of $F$. Then

(a) $K \otimes_F L \cong L[X]/(g(X))$ as associative algebras over $L$,
(b) $K \otimes_F L$ is a semisimple algebra if the polynomial $g(X)$ is separable.


REMARKS. Proposition 10.24 of Basic Algebra shows that the tensor product
$A \otimes_F B$ of two associative algebras with identity over $F$ has a unique associative
algebra structure such that $(a_1 \otimes b_1)(a_2 \otimes b_2) = a_1a_2 \otimes b_1b_2$. Problem 8
at the end of Chapter X shows that if $B$ is an extension field of $F$, then $A \otimes_F B$
is in fact an associative algebra with identity over $B$, the multiplication by $b \in B$
being given by the mapping $1 \otimes (\text{left by } b)$.


PROOF. For (a), let $n = [K : F]$. Form the $F$ bilinear mapping of $F[X] \times L$ into
$L[X]$ given by $(P(X), \ell) \mapsto \ell P(X)$. Corresponding to this $F$ bilinear mapping
is a unique $F$ linear map $\varphi : F[X] \otimes_F L \to L[X]$ carrying $P(X) \otimes \ell$ to
$\ell P(X)$ for $P(X) \in F[X]$ and $\ell \in L$. The $F$ vector space $F[X] \otimes_F L$ is an
$L$ vector space with multiplication by $\ell_0 \in L$ given by the linear mapping
$1 \otimes (\text{left by } \ell_0)$. Since
$\varphi((1 \otimes (\text{left by } \ell_0))(P(X) \otimes \ell)) = \ell_0\ell P(X)
= \ell_0\varphi(P(X) \otimes \ell)$, $\varphi$ is $L$ linear. In addition,
$\varphi((P(X) \otimes \ell)(Q(X) \otimes \ell')) = \varphi(P(X)Q(X) \otimes \ell\ell')
= \ell\ell'P(X)Q(X) = \varphi(P(X) \otimes \ell)\varphi(Q(X) \otimes \ell')$, and therefore
$\varphi$ is an algebra homomorphism.

We follow $\varphi$ with the quotient homomorphism $\psi : L[X] \to L[X]/(g(X))$,
and the composition $\psi\varphi$ is 0 on the ideal $(g(X)) \otimes_F L$ of $F[X] \otimes_F L$.
Therefore $\psi\varphi$ descends to a homomorphism
$(F[X]/(g(X))) \otimes_F L \to L[X]/(g(X))$, hence to a homomorphism
$\eta : K \otimes_F L \to L[X]/(g(X))$. Since $\varphi$ and $\psi$ are onto, so is $\eta$.

It is enough to prove that $\eta$ is one-one. Thus suppose that
$\eta\bigl(\sum_i k_i \otimes \ell_i\bigr) = 0$ with all $k_i$ in $K$, all $\ell_i$ in $L$, and the
$\ell_i$ linearly independent over $F$. Write $k_i = P_i(X) + (g(X))$ with
$\deg P_i(X) < n$ whenever $P_i \neq 0$. Then $\sum_i \ell_iP_i(X) \equiv 0 \bmod g(X)$. Since
$g(X)$ has degree $n$ and each nonzero $P_i(X)$ has degree less than $n$,
$\sum_i \ell_iP_i(X) = 0$. Write $P_i(X) = \sum_{j=0}^{n-1} c_{ij}X^j$ with each $c_{ij}$ in $F$.
Then $\sum_{j=0}^{n-1}\bigl(\sum_i \ell_ic_{ij}\bigr)X^j = 0$, and $\sum_i \ell_ic_{ij} = 0$ for
all $j$. Since the $\ell_i$ are linearly independent over $F$, $c_{ij} = 0$ for all $i$ and
$j$. Thus $k_i = 0$ for all $i$, $\sum_i k_i \otimes \ell_i = 0$, and $\eta$ is one-one. This
proves (a).

For (b), factor $g(X)$ over $L$ as $g_1(X) \cdots g_m(X)$ for polynomials $g_j(X)$
irreducible over $L$. Since the separability of $g$ forces $g_1, \ldots, g_m$ to be
relatively prime in pairs, the Chinese Remainder Theorem implies that

$$L[X]/(g_1(X) \cdots g_m(X)) \cong L[X]/(g_1(X)) \times \cdots \times L[X]/(g_m(X)).$$

Each $L[X]/(g_j(X))$ is a field, and thus $L[X]/(g(X))$ is exhibited as a product of
fields and is semisimple.
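
To see part (b) at work in the smallest separable case, take $F = \mathbb{Q}$ and
$K = L = \mathbb{Q}(\sqrt{2})$, so that $g(X) = X^2 - 2$ splits over $L$ into two distinct
linear factors and $K \otimes_F L$ should be a product of two copies of $L$. The sketch
below represents elements of $K \otimes_{\mathbb{Q}} K$ in the basis
$\sqrt{2}^{\,i} \otimes \sqrt{2}^{\,j}$ and verifies that
$\tfrac12 \pm \tfrac14(\sqrt{2} \otimes \sqrt{2})$ are complementary orthogonal
idempotents. The representation and all names are assumptions made for the illustration.

```python
from fractions import Fraction

KEYS = [(i, j) for i in (0, 1) for j in (0, 1)]


def element(c00=0, c10=0, c01=0, c11=0):
    """Element with coefficient c_ij on sqrt(2)^i (x) sqrt(2)^j."""
    values = {(0, 0): c00, (1, 0): c10, (0, 1): c01, (1, 1): c11}
    return {key: Fraction(values[key]) for key in KEYS}


def tensor_mul(u, v):
    """Multiply two elements, using sqrt(2)^2 = 2 in each tensor factor."""
    out = {key: Fraction(0) for key in KEYS}
    for (i, j), c in u.items():
        for (k, l), d in v.items():
            scale = 2 ** ((i + k) // 2) * 2 ** ((j + l) // 2)
            out[((i + k) % 2, (j + l) % 2)] += c * d * scale
    return out


def tensor_add(u, v):
    return {key: u[key] + v[key] for key in KEYS}


one = element(c00=1)
zero = element()
e_plus = element(c00=Fraction(1, 2), c11=Fraction(1, 4))
e_minus = element(c00=Fraction(1, 2), c11=Fraction(-1, 4))

print(tensor_mul(e_plus, e_plus) == e_plus,
      tensor_mul(e_plus, e_minus) == zero,
      tensor_add(e_plus, e_minus) == one)
```

By contrast, in the inseparable example before Proposition 2.29 no such splitting is
available, since the tensor product there contains a nonzero nilpotent element.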


Corollary 2.30. Let $F$ be a field, let $K$ be a finite separable algebraic extension
of $F$, and let $L$ be another field extension of $F$. Then the algebra $K \otimes_F L$ is
semisimple.


REMARKS. The condition of separability of the extension $K/F$ is automatic
in characteristic 0. The two field extensions $K$ and $L$ in the example before
Proposition 2.29 both failed to be separable extensions of the base field $F$.


PROOF. The Theorem of the Primitive Element (Theorem 9.34 of Basic Algebra)
shows that $K/F$ is a simple extension, say with $K = F(\alpha)$. Since this
extension is assumed separable, the minimal polynomial over $F$ of any element of
$K$ is a separable polynomial. The hypotheses of Proposition 2.29b are therefore
satisfied, and $K \otimes_F L$ is semisimple.


Proposition 2.31. Suppose that $A$ and $B$ are algebras with identity over a field
$F$, that $B$ is simple, and that $B$ has center $F$. Then the two-sided ideals of the
tensor-product algebra $A \otimes_F B$ are all subsets $I \otimes_F B$ such that $I$ is a
two-sided ideal of $A$.


PROOF. The set $I \otimes_F B$ is a two-sided ideal of $A \otimes_F B$, since
$(a \otimes b)(i \otimes b') = ai \otimes bb'$ and since a similar identity applies to
multiplication in the other order.




Conversely suppose that $J$ is an ideal in $A \otimes_F B$. Let $1_B$ be the identity of
$B$, and define $I = \{a \in A \mid a \otimes 1_B \in J\}$. Then $I$ is a two-sided ideal of
$A$, and we shall prove that $J = I \otimes_F B$. The easy inclusion is
$I \otimes_F B \subseteq J$. For this, let $i$ be in $I$ and $b$ be in $B$. Then
$i \otimes 1_B$ is in $J$ and $1_A \otimes b$ is in $A \otimes_F B$. Their product
$i \otimes b$ has to be in $J$, and thus $I \otimes_F B \subseteq J$.

For the reverse inclusion, take a basis $\{x_i\}$ of $I$ over $F$ and extend it to a basis
of $A$ by adjoining some vectors $\{y_j\}$. It is enough to show that any finite sum
$\sum_j y_j \otimes b_j$ in $J$ necessarily has all $b_j$ equal to 0. Arguing by
contradiction, suppose that $\sum_{k=1}^m y_{j_k} \otimes b_{j_k}$ is a nonzero sum in $J$
with $m$ as small as possible and in particular with all $b_{j_k}$ nonzero. Let $H$ be the
subset of $B$ defined by

$$H = \Bigl\{c_{j_1} \Bigm| \sum_{k=1}^m y_{j_k} \otimes c_{j_k} \in J
\text{ for some } m\text{-tuple } \{c_{j_k}\} \subseteq B\Bigr\}.$$

The set $H$ is a two-sided ideal of $B$ containing the nonzero element $b_{j_1}$ of $B$.
Since $B$ is simple by assumption, $H = B$. Thus $1_B$ is in $H$. Therefore some
element

$$y_{j_1} \otimes 1_B + \sum_{k=2}^m y_{j_k} \otimes c_{j_k}$$

is in $J$. Let $b \in B$ be arbitrary. Multiplying the displayed element on the left and
right by $1_A \otimes b$ and subtracting the results shows that

$$y_{j_2} \otimes (bc_{j_2} - c_{j_2}b) + \cdots + y_{j_m} \otimes (bc_{j_m} - c_{j_m}b)$$

is in $J$. Since $m$ was chosen to be minimal, this element must be 0 for all choices
of $b$. Then all coefficients are 0, and the conclusion is that all coefficients $c_{j_k}$
are in the center of $B$, which is $F$ by assumption. Consequently we can rewrite our
element of $J$ as

$$y_{j_1} \otimes 1_B + \sum_{k=2}^m y_{j_k} \otimes c_{j_k}
= y_{j_1} \otimes 1_B + \sum_{k=2}^m c_{j_k}y_{j_k} \otimes 1_B
= (y_{j_1} + c_{j_2}y_{j_2} + \cdots + c_{j_m}y_{j_m}) \otimes 1_B.$$

The definition of $I$ shows that the factor
$y_{j_1} + c_{j_2}y_{j_2} + \cdots + c_{j_m}y_{j_m}$ in the pure tensor on the right is in $I$.
Since the $y_j$'s form a basis of a vector-space complement to $I$, this vector must be 0.
The linear independence of the $y_j$'s over $F$ forces each coefficient to be 0, and we
have arrived at a contradiction because the coefficient of $y_{j_1}$ is 1, not 0.


Lemma 2.32. The center of a finite-dimensional simple algebra $A$ over a field
$F$ is a field that is a finite extension of $F$.


PROOF. By Theorem 2.4, $A \cong M_n(D)$ for some finite-dimensional division
algebra $D$ over $F$. Let $Z$ be the center of $A$. By inspection this consists of the
scalar matrices whose entries lie in the center of $D$. The center of $D$ is a field.
Hence $Z$ is a field, necessarily a finite extension of $F$.




Proposition 2.33. Let $A$ be a finite-dimensional semisimple algebra over a
field $F$ of characteristic 0, and suppose that $K$ is a field containing $F$. Then the
algebra $A \otimes_F K$ over $K$ is semisimple.


PROOF. Since the tensor product of a finite direct sum is the direct sum of tensor
products, we may assume without loss of generality that $A$ is simple. Lemma 2.32
shows that the center $Z$ of $A$ is a finite extension field of $F$. By Corollary 2.30
and the assumption that $F$ has characteristic 0, the algebra $Z \otimes_F K$ is
semisimple. Being commutative, it must be of the form $K_1 \oplus \cdots \oplus K_s$ with
each ideal $K_j$ equal to a field, by Theorem 2.2.

Each ideal $K_j$ is a unital $Z \otimes_F K$ module, hence is both a unital $Z$ module and
a unital $K$ module. Thus we can regard each $K_j$ as an extension field of $Z$ or of
$K$, whichever we choose. First let us regard $K_j$ as an extension field of $Z$. Since
$K_j$ has no nontrivial ideals and $A$ has center $Z$, Proposition 2.31 shows that the
$Z$ algebra $A \otimes_Z K_j$ is simple as a ring.

Next let us regard $K_j$ as an extension field of $K$; since $A$ is finite-dimensional
over $F$, so is $Z$. Therefore $Z \otimes_F K$ is finite-dimensional over $K$, and $K_j$ is a
finite extension of $K$. Hence $A \otimes_Z K_j$ is a finite-dimensional algebra over $K$,
and it is left Artinian as a ring.

By Theorem 2.6, any left Artinian simple ring such as $A \otimes_Z K_j$ is necessarily
semisimple. Using the associativity formula for tensor products given in
Proposition 10.22 of Basic Algebra, we obtain an isomorphism of rings

$$A \otimes_F K \cong (A \otimes_Z Z) \otimes_F K \cong A \otimes_Z (Z \otimes_F K)
\cong A \otimes_Z (K_1 \oplus \cdots \oplus K_s) \cong \bigoplus_{j=1}^s (A \otimes_Z K_j),$$

the summands being two-sided ideals in each case. Since each $A \otimes_Z K_j$ is a
finite-dimensional simple algebra over $K$, $A \otimes_F K$ is a semisimple algebra over
$K$ by Theorem 2.4.


Let us digress for a moment, returning in Lemma 2.34 to the argument that
leads to the proof of Theorem 2.17. In the next section we shall want to know
circumstances under which we can draw the same conclusion as in Proposition
2.33 without assuming that the characteristic is 0. Write the finite-dimensional
semisimple algebra $A$ as $A \cong M_{n_1}(D_1) \times \cdots \times M_{n_r}(D_r)$, where each
$D_j$ is a division algebra over $F$. Let $Z_1, \ldots, Z_r$ be the respective centers of the
simple factors of $A$. Lemma 2.32 shows that each $Z_j$ is a finite extension field of $F$.
The proof of Proposition 2.33 appealed to Corollary 2.30 to conclude from the
condition characteristic 0 that $Z_j \otimes_F K$ is semisimple. Instead, by rereading the
statement of Corollary 2.30, we see that it would have been enough for each $Z_j$ to
be a finite separable field extension of $F$, even if $F$ did not have characteristic 0.
Then the rest of the above proof goes through without change. Accordingly we
define a finite-dimensional semisimple algebra $A$ over a field $F$ to be a separable
semisimple algebra if the center of each simple component of $A$ is a separable
extension field of $F$. In terms of this definition, we obtain the following improved
version of Proposition 2.33.


Proposition 2.33'. Let $A$ be a finite-dimensional separable semisimple algebra
over a field $F$, and suppose that $K$ is a field containing $F$. Then the algebra
$A \otimes_F K$ over $K$ is semisimple.


Lemma 2.34. Suppose that A is a finite-dimensional algebra with identity 
over a field F, and suppose that N is a nilpotent two-sided ideal of A such that 
the algebra A/N is semisimple. Then N = rad A. 


PROOF. The algebra $A$ is left Artinian, being finite-dimensional. Since $N$
is nilpotent, we must have $N \subseteq \operatorname{rad} A$. The two-sided ideal
$(\operatorname{rad} A)/N$ of the semisimple algebra $A/N$ is nilpotent and hence must be 0.
Therefore $N = \operatorname{rad} A$.


PROOF OF THEOREM 2.17. Let $A$ be the given finite-dimensional algebra over the
field $F$ of characteristic 0, and write $N$ for $\operatorname{rad} A$ and $\bar{A}$ for $A/N$. For
any extension field $K$ of $F$, we write $A_K = A \otimes_F K$, $N_K = N \otimes_F K$, and
$\bar{A}_K = \bar{A} \otimes_F K$.

For most of the proof, we shall treat the special case that $N^2 = 0$. Let
$\bar{F}$ be an algebraic closure of $F$. Then
$\bar{A}_{\bar{F}} = \bar{A} \otimes_F \bar{F} = (A/N) \otimes_F \bar{F}
\cong (A \otimes_F \bar{F})/(N \otimes_F \bar{F}) = A_{\bar{F}}/N_{\bar{F}}$. Proposition 2.33
shows that $\bar{A}_{\bar{F}} = \bar{A} \otimes_F \bar{F}$ is a semisimple algebra over
$\bar{F}$, and the claim is that the two-sided ideal $N_{\bar{F}}$ of $A_{\bar{F}}$ is
nilpotent. In fact, any element of $N_{\bar{F}}$ is a finite sum of the form
$\sum_i (a_i \otimes c_i)$ with each $a_i$ in $N$ and each $c_i$ in $\bar{F}$. The product of
this element with $\sum_j (a'_j \otimes c'_j)$ is $\sum_{i,j} (a_ia'_j \otimes c_ic'_j)$, and
this is 0 because the assumption $N^2 = 0$ implies that $a_ia'_j = 0$ for all $i$ and $j$.
Thus $N_{\bar{F}}^2 = 0$, and $N_{\bar{F}}$ is nilpotent.

Since $A_{\bar{F}}/N_{\bar{F}}$ is semisimple and $N_{\bar{F}}$ is nilpotent, Lemma 2.34 shows
that $N_{\bar{F}} = \operatorname{rad}(A_{\bar{F}})$. Corollary 2.28 (a special case of Theorem 2.17)
is applicable to $A_{\bar{F}}$ because $\bar{F}$ is algebraically closed, and it follows that
there exists a subalgebra $S$ of $A_{\bar{F}}$ such that $A_{\bar{F}} = S \oplus N_{\bar{F}}$ as
vector spaces. Here $S$ is a product of finitely many algebras $M_{n_j}(\bar{F})$. The
embedded matrix units $e_{ij}$ of $S$ obtained from each $M_{n_j}(\bar{F})$ are members of
$A_{\bar{F}} = A \otimes_F \bar{F}$ and hence are of the form $\sum_{l=1}^m x_l \otimes c_l$,
where $\{x_l\}_{l=1}^m$ is a vector-space basis of $A$ over $F$ and each $c_l$ is in $\bar{F}$.
Only finitely many such $c_l$'s are needed to handle all $e_{ij}$'s, and we let $K$ be a
finite extension of $F$ within $\bar{F}$ containing all of them. Let
$\rho_0 = 1, \rho_1, \ldots, \rho_s$ be a vector-space basis of $K$ over $F$.




Relative to this K, we form Ax, Nx, and Ax as in the first paragraph of the 
proof. The same argument as with F shows that Ax = Ax/Nx is semisimple 
and that Nx is nilpotent. By Lemma 2.34, Nx = rad Ax. The formulas for the 
e;; 8 in the previous paragraph are valid in Ax and give us a system of matrix 
units. As in the previous paragraph, Corollary 2.28 produces a subalgebra S of 
Ax isomorphic to some M,,(K) x --- x My,(K) such that Ax = S @ Nx as 
vector spaces. 

In the basis {x;}"_, of A over F, we may assume that the first t vectors form 
a basis of N = rad A and the remaining vectors form a basis of a vector-space 
complement to N. We identify members a of A with members a @ | of Ax. 
With this identification in force, we decompose each basis vector x; fori > t 
according to Ax = S @ Nx as x; = y; — z; with y; € S and z; € Nx. Since the 
x;’s fori < t arein N C Nx, the vectors y; withi > t form a vector-space basis 
of S over K. Fori > t, write z; = Yi-0 zij ® o; with z;; in N. Then we have 


S 
Ye = hi Fp = Oa + Zio) + Ye zy @ 0; fori >t. 
j=l 


Put 
Ss 
xy = Xi + Zia and z= ij @ hj fori >t. 
j=l 


Phen ey Ua 4, is a basis of A over F. We shall show that So = 
y141 Fx; is a subalgebra of A, and then A = Sp @ N will be the required 
decomposition. 

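Let $x_i'$ and $x_j'$ be given with $i > t$ and $j > t$, and write

\[ x_i' x_j' = \sum_{k>t} \gamma_{kij} x_k' + v_{ij} \qquad \text{with } \gamma_{kij} \in F \text{ and } v_{ij} \in N. \]

Substituting $x_i' = y_i - z_i'$ and taking into account that $N_K$ is an ideal in $A_K$, we
have

\[ y_i y_j \equiv \sum_{k>t} \gamma_{kij} x_k' \bmod N_K \equiv \sum_{k>t} \gamma_{kij} y_k \bmod N_K. \]

Then $y_i y_j = \sum_{k>t} \gamma_{kij} y_k + u_{ij}$ with each $u_{ij} \in N_K$. Since the $y_i$ are in $S$ and $S$
is a subalgebra, $u_{ij} = 0$. Thus $y_i y_j = \sum_{k>t} \gamma_{kij} y_k$. Let us resubstitute into this
equality from $y_i = x_i' + z_i'$. Taking into account that $z_i' z_j' = 0$ because $N_K^2 = 0$,
we obtain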

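\[ x_i' x_j' + x_i' z_j' + z_i' x_j' = \sum_{k>t} \gamma_{kij} x_k' + \sum_{k>t} \gamma_{kij} z_k'. \]

Substituting from $z_i' = \sum_{l=1}^{s} z_{il} \otimes \rho_l$ gives

\[ x_i' x_j' \otimes 1 + \sum_{l=1}^{s} x_i' z_{jl} \otimes \rho_l + \sum_{l=1}^{s} z_{il} x_j' \otimes \rho_l
   = \sum_{k>t} \gamma_{kij} x_k' \otimes 1 + \sum_{l=1}^{s} \sum_{k>t} \gamma_{kij} z_{kl} \otimes \rho_l. \]

The coefficients of $\rho_0 = 1$ must be equal, and therefore

\[ x_i' x_j' = \sum_{k>t} \gamma_{kij} x_k'. \]

This equation shows that $S_0$ is a subalgebra and completes the proof under the
hypothesis that $N^2 = 0$.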

Now we drop the assumption that $N^2 = 0$. We shall prove the theorem
by induction on $\dim_F A$, the base cases of the induction being $\dim_F A = 0$
and $\dim_F A = 1$, for which the theorem is immediate by inspection. For the
inductive case, let $A$ be given, and assume the theorem to be known for algebras of
dimension $< \dim_F A$. If $N^2 = 0$, then we are done. Thus we may assume that the
product ideal $N^2$ is nonzero and therefore that $\dim_F(A/N^2) < \dim_F A$. The First
Isomorphism Theorem shows that $(A/N^2)/(N/N^2) \cong A/N = \bar A$. The quotient
$A/N$ is semisimple, and $N/N^2$ is a nilpotent ideal in $A/N^2$. By Lemma 2.34,
$N/N^2 = \operatorname{rad}(A/N^2)$. The inductive hypothesis gives $A/N^2 = S_1/N^2 \oplus N/N^2$
for a subalgebra $S_1$ of $A$ with $S_1 \supseteq N^2$. This means that $A = S_1 + N$ and
$S_1 \cap N = N^2$. Here

\[ \dim_F A = \dim_F S_1 + \dim_F N - \dim_F N^2 = \dim_F S_1 + \dim_F(N/N^2), \]

and $N/N^2 \ne 0$ implies $\dim_F S_1 < \dim_F A$. The Second Isomorphism Theorem
gives $A/N = (S_1 + N)/N \cong S_1/(S_1 \cap N) = S_1/N^2$. Thus $S_1/N^2$ is semisimple.
Since $N^2$ is nilpotent, Lemma 2.34 shows that $N^2 = \operatorname{rad} S_1$. The inductive
hypothesis gives $S_1 = S \oplus N^2$ for a semisimple subalgebra $S$. Substituting
into $A = S_1 + N$, we obtain $A = (S \oplus N^2) + N = S + N$. Meanwhile,
$S \cap N = (S \cap S_1) \cap N = S \cap (S_1 \cap N) = S \cap N^2 = 0$. Therefore $A = S \oplus N$,
and the induction is complete.


7. Skolem—Noether Theorem 


In this section we begin an investigation of division algebras that are finite- 
dimensional over a given field F. A nonzero algebra A with identity over a field 
F will be called central if the center of A consists exactly of the scalar multiples 
of the identity, i.e., if center(A) = F. Of special interest will be algebras with
identity that are central simple, i.e., are both central and simple. 


Lemma 2.35. Let $A$ and $B$ be algebras with identity over a field $F$, and
suppose that $B$ is central. Then


(a) the members of $A \otimes_F B$ commuting with $1 \otimes B$ are the members of $A \otimes 1$,
(b) $\operatorname{center}(A \otimes_F B) = (\operatorname{center} A) \otimes_F 1$.




PROOF. For (a), suppose that $z = \sum_i a_i \otimes b_i$ commutes with $1 \otimes B$ and that
the $a_i$ are linearly independent over $F$. If $b$ is in $B$, then

\[ 0 = (1 \otimes b)z - z(1 \otimes b) = \sum_i a_i \otimes (b b_i - b_i b), \]

from which it follows that $b b_i - b_i b = 0$ for all $b$ and all $i$. Since $B$ is central,
each $b_i$ is in $F$, and we can write $z$ as

\[ z = \sum_i (a_i \otimes b_i) = \sum_i (a_i b_i \otimes 1) = \Bigl(\sum_i a_i b_i\Bigr) \otimes 1. \]

In other words, $z$ is of the form $z = a \otimes 1$.

For (b), we need to prove the inclusion $\subseteq$. Thus let $z$ be in $\operatorname{center}(A \otimes_F B)$.
By (a), $z$ is of the form $z = a \otimes 1$ for some $a \in A$. Now suppose that $a'$ is in $A$.
Then $0 = (a' \otimes 1)z - z(a' \otimes 1) = (a'a - aa') \otimes 1$. Hence $a'a = aa'$, and we
conclude that $a$ is in $\operatorname{center}(A)$.


Proposition 2.36. Let $A$ and $B$ be algebras with identity over a field $F$, and
suppose that $B$ is central simple. Then
(a) $A$ simple implies $A \otimes_F B$ simple,
(b) $A$ central simple implies $A \otimes_F B$ central simple.


PROOF. For (a), Proposition 2.31 shows that any two-sided ideal of $A \otimes_F B$ is
of the form $I \otimes_F B$ for some two-sided ideal $I$ of $A$. Since $A$ is assumed simple,
the only $I$'s are $0$ and $A$. Thus the only ideals in $A \otimes_F B$ are $0$ and $A \otimes_F B$, and
$A \otimes_F B$ is simple.

For (b), conclusion (a) shows that $A \otimes_F B$ is simple. By Lemma 2.35b the
center of $A \otimes_F B$ is $(\operatorname{center} A) \otimes 1 = F1 \otimes 1 = F(1 \otimes 1)$, and hence $A \otimes_F B$
is central.


Corollary 2.37. If $A$ and $B$ are finite-dimensional semisimple algebras over a
field $F$ and at least one of them is separable over $F$, then $A \otimes_F B$ is semisimple.


REMARK. The definition of separability of A or B appears between Proposition 
2.33 and Proposition 2.33’. 


PROOF. Without loss of generality, we may assume that $A$ and $B$ are simple.
For definiteness let us say that $A$ is the given separable algebra over $F$. Let
$K = \operatorname{center}(B)$. Lemma 2.32 shows that $K$ is a field, and associativity of tensor
products allows us to write

\[ A \otimes_F B \cong A \otimes_F (K \otimes_K B) \cong (A \otimes_F K) \otimes_K B. \]

Here $A \otimes_F K$ is semisimple by Proposition 2.33', and $B$ is central simple over
$K$. Thus Proposition 2.36a applies to each simple factor of $A \otimes_F K$ and shows
that $(A \otimes_F K) \otimes_K B$ is semisimple.




Corollary 2.38. Let $A$ be a central simple algebra of finite dimension $n$ over
a field $F$, and let $A^{o}$ be the opposite algebra. Then $A \otimes_F A^{o} \cong M_n(F)$.


EXAMPLE. Take $F = \mathbb{R}$ and $A = \mathbb{H}$, the algebra of quaternions. Then
conjugation, with $1 \mapsto 1$ and $i, j, k \mapsto -i, -j, -k$, is an antiautomorphism of
$\mathbb{H}$. Consequently $\mathbb{H}^{o} \cong \mathbb{H}$. The corollary says in this case that $\mathbb{H} \otimes_{\mathbb{R}} \mathbb{H} \cong M_4(\mathbb{R})$.


PROOF. Let $V$ be $A$ considered as a vector space. For each $a_0 \in A$, we associate
the members $l(a_0)$ and $r(a_0)$ of $\operatorname{End}_F(V)$ given by $l(a_0)a = a_0 a$ and $r(a_0)a = a a_0$.
Then $l(a_0 a_0') = l(a_0) l(a_0')$ and $r(a_0 a_0') = r(a_0') r(a_0)$, and it follows
that $l : A \to \operatorname{End}_F(V)$ and $r : A^{o} \to \operatorname{End}_F(V)$ are algebra homomorphisms
sending 1 to 1.

Meanwhile, the map $A \times A^{o} \to \operatorname{End}_F(V)$ given by $(a, a') \mapsto l(a) r(a')$ is $F$
bilinear and extends to an $F$ linear map $\varphi : A \otimes_F A^{o} \to \operatorname{End}_F(V)$. Because of the
homomorphism properties of $l$ and $r$, the mapping $\varphi$ is an algebra homomorphism
sending 1 to 1. Proposition 2.36 shows that $A \otimes_F A^{o}$ is simple, and it follows
that $\varphi$ is one-one. Since $\dim_F(A \otimes_F A^{o}) = (\dim_F A)^2 = \dim_F \operatorname{End}_F(V)$, $\varphi$ is
onto.
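

As a numerical illustration of the case $A = \mathbb{H}$, the following Python sketch checks that the sixteen operators $v \mapsto a v b$, with $a$ and $b$ running over the basis $\{1, i, j, k\}$, are linearly independent in the 16-dimensional space of real linear maps of $\mathbb{H}$ to itself. This is the statement that the map $\varphi$ of the proof is one-one and onto in this case. The encoding of quaternions as 4-tuples and the helper names `qmul`, `operator_as_vector`, and `rank` are choices made here for illustration; they are not notation from the text.

```python
from fractions import Fraction
from itertools import product

def qmul(p, q):
    """Hamilton product of quaternions stored as (1, i, j, k) coordinates."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + b1*a2 + c1*d2 - d1*c2,
        a1*c2 - b1*d2 + c1*a2 + d1*b2,
        a1*d2 + b1*c2 - c1*b2 + d1*a2,
    )

BASIS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

def operator_as_vector(a, b):
    """Flatten the linear map v -> a*v*b, i.e. l(a) r(b), into 16 coordinates."""
    coords = []
    for v in BASIS:
        coords.extend(qmul(qmul(a, v), b))
    return [Fraction(c) for c in coords]

def rank(rows):
    """Rank of a list of rational row vectors, by Gaussian elimination."""
    rows = [r[:] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [x - factor*y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

# One operator for each basis element a (x) b of the tensor product.
operators = [operator_as_vector(a, b) for a, b in product(BASIS, BASIS)]
print(rank(operators))
```

A rank of 16 means that the operators $l(a) r(b)$ span all of $\operatorname{End}_{\mathbb{R}}(\mathbb{H}) \cong M_4(\mathbb{R})$, in agreement with the corollary. The computation is done over the rationals, which suffices because all the structure constants are integers.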


Corollary 2.39. Let $A$ be a central simple algebra of finite dimension $d$ over
a field $F$. Then $d$ is the square of an integer.


PROOF. Let $\bar F$ be an algebraic closure of $F$. Proposition 2.36a shows that
the algebra $\bar F \otimes_F A$ is simple, and its dimension over $\bar F$ is $d$. A simple finite-dimensional
algebra over an algebraically closed field is a full matrix algebra over
that field, and thus $\bar F \otimes_F A \cong M_n(\bar F)$ for some $n$. Comparing dimensions over $\bar F$, we see
that $d = n^2$.


Corollary 2.40. If $D$ is a division algebra finite-dimensional over its center
$F$, then $\dim_F D$ is the square of an integer.


PROOF. The algebra $D$ is central simple over its center $F$, and the result is
immediate from Corollary 2.39.


Theorem 2.41 (Skolem–Noether Theorem). Let $A$ be a finite-dimensional
central simple algebra over the field $F$, and let $B$ be any simple algebra over $F$.
Suppose that $f$ and $g$ are $F$ algebra homomorphisms of $B$ into $A$ carrying the
identity to the identity. Then there exists an $x \in A$ with $f(b) = x g(b) x^{-1}$ for all
$b$ in $B$.


PROOF. Let us observe that the homomorphisms f and g are one-one because 
B is simple, and the finite dimensionality of A therefore forces B to be finite- 
dimensional. 




We consider first the special case that $A = M_n(F)$ for some $n$. The homomorphism
$f$ makes the space $F^n$ of column vectors into a unital left $B$ module by
the definition $bv = f(b)v$, and similarly the homomorphism $g$ makes $F^n$ into a
unital left $B$ module. Since $B$ is finite-dimensional and simple, an argument given
with Example 1 of semisimple rings in Section 2 shows that there is only one
simple left $B$ module up to isomorphism and that every unital left $B$ module is a
direct sum of copies of this simple left $B$ module. Consequently the isomorphism
classes of the $B$ modules determined by $f$ and $g$ depend only on their dimension.
The dimension is $n$ in both cases, and hence there exists an invertible $F$ linear
map $L : F^n \to F^n$ such that $L f(b) v = g(b) L v$ for all $v \in F^n$. If $L$ is given by
the matrix $x^{-1}$ in $M_n(F)$, then $x^{-1} f(b) = g(b) x^{-1}$, and the theorem is therefore
proved in this special case.

For the general case we form the tensor products $B \otimes_F A^{o}$ and $A \otimes_F A^{o}$. The
maps $f \otimes 1$ and $g \otimes 1$ are $F$ algebra homomorphisms between these algebras,
$B \otimes_F A^{o}$ is simple by Proposition 2.36a, and Corollary 2.38 shows that $A \otimes_F A^{o}$
is isomorphic to $M_n(F)$ for the integer $n = \dim_F A$. The special case is applicable,
and we obtain an invertible element $X$ of $A \otimes_F A^{o}$ such that

\[ (f \otimes 1)(b \otimes a^{o}) = X (g \otimes 1)(b \otimes a^{o}) X^{-1} \qquad \text{for all } b \in B \text{ and } a^{o} \in A^{o}. \tag{$*$} \]

Taking $b = 1$, we see that $1 \otimes a^{o} = X(1 \otimes a^{o})X^{-1}$ for all $a^{o} \in A^{o}$. By Lemma
2.35a, $X$ lies in $A \otimes 1$, hence is of the form $X = x \otimes 1$ for some $x$ in $A$. Substituting
for $X$ in $(*)$, we obtain $f(b) = x g(b) x^{-1}$ as required.
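

A small concrete instance may help fix ideas. The Python sketch below takes $F = \mathbb{Q}$, $A = M_2(\mathbb{Q})$, and $B = \mathbb{Q}(i)$, with $f(a + bi) = aI + bJ$ and $g(a + bi) = aI - bJ$ for the matrix $J$ with $J^2 = -I$ shown in the code. These choices, the conjugating matrix `x`, and the helper names `matmul`, `embed`, and `conjugated_g` are assumptions made here for illustration and do not come from the text. The sketch checks on sample elements that $f(b) = x g(b) x^{-1}$.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2-by-2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def embed(re, im, root):
    """Send re + im*i to re*I + im*root, where root is a matrix with root^2 = -I."""
    return [[re * (1 if r == c else 0) + im * root[r][c] for c in range(2)]
            for r in range(2)]

J = [[0, -1], [1, 0]]        # J*J = -I
NEG_J = [[0, 1], [-1, 0]]    # the other square root of -I used below

def f(re, im):
    return embed(re, im, J)

def g(re, im):
    return embed(re, im, NEG_J)

# Candidate conjugating element; it is its own inverse.
x = [[1, 0], [0, -1]]
x_inv = [[1, 0], [0, -1]]

def conjugated_g(re, im):
    return matmul(matmul(x, g(re, im)), x_inv)

samples = [
    (Fraction(1), Fraction(0)),
    (Fraction(0), Fraction(1)),
    (Fraction(3, 2), Fraction(-5, 7)),
]
print(all(f(re, im) == conjugated_g(re, im) for re, im in samples))
```

Since both sides are $\mathbb{Q}$ linear in $(a, b)$, agreement on $1$ and $i$ already gives agreement on all of $B$; the third sample is included only as a sanity check.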


Corollary 2.42. If $A$ is a finite-dimensional central simple algebra over the
field $F$, then every $F$ automorphism of $A$ is inner in the sense of being given by
conjugation by an invertible element of $A$.


PROOF. This is the special case of Theorem 2.41 in which B = A and g is the 
identity map on B. 


8. Double Centralizer Theorem 


We saw in Corollary 2.40 that if $D$ is a division algebra finite-dimensional over
its center $F$, then $\dim_F D$ is the square of an integer. In this section we shall
prove a theorem from which we can conclude that the positive integer of which
$\dim_F D$ is the square is the dimension of any maximal subfield of $D$. We state the
theorem, establish two lemmas, prove the theorem, and then derive two corollaries
concerning maximal subfields of division algebras.

If A is an algebra with identity and B is a subalgebra containing the identity, 
then the centralizer of B in A is the subalgebra of all members of A commuting 
with every element of B. 




Theorem 2.43 (Double Centralizer Theorem). Let $A$ be a finite-dimensional
central simple algebra over a field $F$, let $B$ be a simple subalgebra of $A$, and let
$C$ be the centralizer of $B$ in $A$. Then $C$ is simple, $B$ is the centralizer of $C$ in $A$,
and $(\dim_F B)(\dim_F C) = \dim_F A$.


Lemma 2.44. Let $A$ and $A'$ be algebras with identity over a field $F$, let $B$ and
$B'$ be subalgebras of them, and let $C$ and $C'$ be the centralizers of $B$ and $B'$ in $A$
and $A'$, respectively. Then the centralizer of $B \otimes_F B'$ in $A \otimes_F A'$ is $C \otimes_F C'$.


PROOF. Expand an element of $A \otimes_F A'$ for the moment as $x = \sum_i a_i \otimes a_i'$ with
the elements $a_i'$ linearly independent over $F$. If $x$ satisfies $x(b \otimes 1) = (b \otimes 1)x$
for all $b$ in $B$, then $\sum_i (a_i b - b a_i) \otimes a_i' = 0$. Since the $a_i'$'s are independent,
$a_i b - b a_i = 0$ for all $i$, and each $a_i$ is in $C$. Thus the centralizer of $B \otimes 1$ is
$C \otimes_F A'$.

Rewriting $x$ with the $a_i$'s assumed independent, we see similarly that the
centralizer of $1 \otimes_F B'$ is $A \otimes_F C'$. Putting these conclusions together, we see that

\[ \operatorname{centralizer}(B \otimes_F B') \subseteq \operatorname{centralizer}(B \otimes_F 1) \cap \operatorname{centralizer}(1 \otimes_F B')
   = (C \otimes_F A') \cap (A \otimes_F C') = C \otimes_F C'. \]

The reverse inclusion, namely $\operatorname{centralizer}(B \otimes_F B') \supseteq C \otimes_F C'$, is immediate,
and the lemma follows.


Lemma 2.45. Let $B$ be a finite-dimensional simple algebra over a field $F$, and
write $V$ for the algebra $B$ considered as a vector space. For $b$ in $B$ and $v$ in $V$,
define members $l(b)$ and $r(b)$ of $\operatorname{End}_F(V)$ by $l(b)v = bv$ and $r(b)v = vb$. Then
the centralizer in $\operatorname{End}_F(V)$ of $l(B)$ is $r(B)$.


PROOF. Let $K$ be the center of $B$. This is an extension field of $F$ by Lemma
2.32, and $B$ is central simple over $K$. Let us see that any member $a$ of $\operatorname{End}_F(V)$
that centralizes $l(B)$ is actually in $\operatorname{End}_K(V)$. If $c$ is in $K$, then $c$ is in particular
in $B$, and therefore $a\,l(c) = l(c)\,a$. Applying this equality to $v \in V$ yields
$a(cv) = c\,a(v)$, and this equality for all $c \in K$ says that $a$ is in $\operatorname{End}_K(V)$.

Thus it is enough to show that the centralizer of $l(B)$ in $\operatorname{End}_K(V)$ is $r(B)$.
We argue as in the proof of Corollary 2.38: The definitions of $l$ and $r$ make $V$
into a unital left $B$ module and a unital right $B$ module, and the members of $K$
operate consistently on either side of $V$ because $K$ lies in the center of $B$. The
function $(b, b') \mapsto l(b) r(b')$ is therefore $K$ bilinear, and it extends to the tensor
product $B \otimes_K B^{o}$ as an algebra homomorphism $\varphi : B \otimes_K B^{o} \to \operatorname{End}_K(V)$. The
homomorphism $\varphi$ is one-one, since Proposition 2.36a shows $B \otimes_K B^{o}$ to be simple.
The dimensional equality $\dim_K(B \otimes_K B^{o}) = (\dim_K B)^2 = \dim_K(\operatorname{End}_K(V))$
allows us to conclude that $\varphi$ is onto, hence is an isomorphism.

Lemma 2.35a shows that the centralizer of $B \otimes_K 1$ in $B \otimes_K B^{o}$ is $1 \otimes_K B^{o}$.
If this statement is translated from the context of $B \otimes_K B^{o}$ into the isomorphic
context of $\operatorname{End}_K(V)$, then the centralizer of $l(B)$ in $\operatorname{End}_K(V)$ is $r(B)$, and we
saw that this fact is sufficient to imply the lemma.


PROOF OF THEOREM 2.43. Let $V$ be the algebra $B$ considered as a vector
space over $F$, and let $l(B)$ and $r(B)$ be the sets of those members of $\operatorname{End}_F(V)$
that are given by left multiplication and right multiplication by members of $B$.
The algebra $A$ is central simple by assumption, and $\operatorname{End}_F(V)$ is central simple,
being isomorphic to $M_n(F)$ for the integer $n = \dim_F(V)$. By Proposition 2.36b,
$A \otimes_F \operatorname{End}_F(V)$ is central simple. We define two algebra homomorphisms $f$ and
$g$ of $B$ into $A \otimes_F \operatorname{End}_F(V)$ by $f(b) = b \otimes 1$ and $g(b) = 1 \otimes l(b)$.

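The Skolem–Noether Theorem (Theorem 2.41) produces an element $x$ of
$A \otimes_F \operatorname{End}_F(V)$ with $f(b) = x g(b) x^{-1}$ for all $b \in B$. Hence

\[ B \otimes_F 1 = x\bigl(1 \otimes_F l(B)\bigr)x^{-1}. \tag{$*$} \]

Lemma 2.44 shows that the centralizer of $B \otimes_F 1$ in $A \otimes_F \operatorname{End}_F(V)$ is
$C \otimes_F \operatorname{End}_F(V)$ and, together with Lemma 2.45, that the centralizer of
$1 \otimes_F l(B)$ is $A \otimes_F r(B)$. From the latter identification the centralizer of
$x(1 \otimes_F l(B))x^{-1}$ is $x(A \otimes_F r(B))x^{-1}$.
Combining $(*)$ with these computations of centralizers, we see that

\[ C \otimes_F \operatorname{End}_F(V) = x\bigl(A \otimes_F r(B)\bigr)x^{-1}. \tag{$**$} \]

The algebra $A \otimes_F r(B)$ is isomorphic to $A \otimes_F B^{o}$, which is simple by Proposition
2.36a. Therefore $C \otimes_F \operatorname{End}_F(V)$ is simple, and $C$ has to be simple.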
Equating the dimensions of the two sides of $(**)$ gives

\[ (\dim_F C)(\dim_F B)^2 = (\dim_F C)(\dim_F \operatorname{End}_F(V)) = \dim_F(C \otimes_F \operatorname{End}_F(V))
   = \dim_F(A \otimes_F r(B)) = (\dim_F A)(\dim_F B), \]

and hence

\[ (\dim_F C)(\dim_F B) = \dim_F A. \]

Finally the centralizer $D$ of $C$ contains $B$, and two applications of the dimensional
equality give

\[ (\dim_F D)(\dim_F C) = \dim_F A = (\dim_F C)(\dim_F B). \]

Thus $\dim_F D = \dim_F B$, and we must have $D = B$. In other words, $B$ is the
centralizer of $C$.
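

The dimension formula can be checked by machine in a small case. The Python sketch below works in $A = M_4(\mathbb{Q})$ with $B$ the subfield spanned by $I$ and a matrix $T$ with $T^2 = 2I$, so that $B \cong \mathbb{Q}(\sqrt{2})$. It computes the centralizer $C$ of $B$ and then the centralizer $D$ of $C$ by solving the linear equations $Xg = gX$. The particular matrix $T$ and the helper names `commutator_rows`, `nullspace`, and `centralizer` are illustrative assumptions made here, not constructions from the text.

```python
from fractions import Fraction

N = 4  # work inside A = M_4(Q), which is central simple of dimension 16

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def commutator_rows(g):
    """Rows of the linear system X*g - g*X = 0 in the 16 unknown entries of X."""
    rows = []
    for i in range(N):
        for j in range(N):
            row = [Fraction(0)] * (N * N)
            for k in range(N):
                row[i * N + k] += g[k][j]   # (X g)[i][j] = sum_k X[i][k] g[k][j]
                row[k * N + j] -= g[i][k]   # (g X)[i][j] = sum_k g[i][k] X[k][j]
            rows.append(row)
    return rows

def nullspace(rows):
    """Basis of the solutions of rows * x = 0, via reduced row echelon form."""
    rows = [r[:] for r in rows]
    ncols = len(rows[0])
    pivots = []
    rk = 0
    for col in range(ncols):
        p = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if p is None:
            continue
        rows[rk], rows[p] = rows[p], rows[rk]
        rows[rk] = [v / rows[rk][col] for v in rows[rk]]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        pivots.append(col)
        rk += 1
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for r, pc in enumerate(pivots):
            vec[pc] = -rows[r][free]
        basis.append(vec)
    return basis

def centralizer(generators):
    """Basis, as 4-by-4 matrices, of all X in M_4(Q) commuting with each generator."""
    rows = [row for g in generators for row in commutator_rows(g)]
    return [[v[i * N:(i + 1) * N] for i in range(N)] for v in nullspace(rows)]

IDENTITY = [[1 if r == c else 0 for c in range(N)] for r in range(N)]
T = [[0, 2, 0, 0], [1, 0, 0, 0], [0, 0, 0, 2], [0, 0, 1, 0]]   # T*T = 2*I

B_BASIS = [IDENTITY, T]     # B = Q[T], a field of dimension 2 over Q
C = centralizer([T])        # centralizer of B in A
D = centralizer(C)          # centralizer of C in A
print(len(B_BASIS), len(C), len(D))
```

With these choices the three dimensions should come out as 2, 8, and 2, so that $(\dim B)(\dim C) = 16 = \dim A$ and the double centralizer $D$ has the same dimension as $B$, as the theorem predicts.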




Corollary 2.46. Let $D$ be a central finite-dimensional division algebra over
the field $F$. If $K$ is any maximal subfield of $D$, then $\dim_F D = (\dim_F K)^2$.


PROOF. Apply the Double Centralizer Theorem (Theorem 2.43) with $A = D$.
Let $Z(K)$ be the centralizer of the simple subalgebra $K$ in $D$. Since $K$ is
commutative, $K \subseteq Z(K)$. If $a$ is in $Z(K)$ but not $K$, then $K(a)$ is a field in $D$
properly containing $K$, in contradiction to the assumption that $K$ is a maximal
subfield of $D$. Hence $K = Z(K)$. The dimensional equality in the theorem
therefore gives $\dim_F D = (\dim_F K)(\dim_F Z(K)) = (\dim_F K)^2$.


Corollary 2.47. Let $A$ be a finite-dimensional central simple algebra over a
field $F$, and let $K$ be a subfield of $A$. Then the following are equivalent:
(a) $K$ is its own centralizer,
(b) $\dim_F A = (\dim_F K)^2$,
(c) $K$ is a maximal commutative subalgebra of $A$.


PROOF. Let $Z(K)$ be the centralizer of $K$ in $A$. The Double Centralizer
Theorem (Theorem 2.43) gives the equality

\[ \dim_F A = (\dim_F K)(\dim_F Z(K)). \tag{$*$} \]

If (a) holds, then $Z(K) = K$, and $(*)$ yields (b).
If (b) holds, then $(*)$ and the equality $\dim_F A = (\dim_F K)^2$ together imply
that $\dim_F Z(K) = \dim_F K$. Since $K$ is commutative, $Z(K) \supseteq K$. The equality
of dimensions implies that $Z(K) = K$, and then (c) follows because any commutative
subalgebra containing $K$ lies in $Z(K)$.
If (c) holds, we start from the inclusion $K \subseteq Z(K)$. If $x$ is in $Z(K)$ but
not $K$, then the subalgebra $K[x]$ generated by $K$ and $x$ is commutative and
strictly larger than $K$, in contradiction to (c). Thus
$K = Z(K)$, and (a) holds.


9. Wedderburn’s Theorem about Finite Division Rings


The theorem of this section is as follows.


Theorem 2.48 (Wedderburn). Every finite division ring is a field.


The proof will be preceded by a lemma.


Lemma 2.49. If $G$ is a finite group and $H$ is a proper subgroup, then
$\bigcup_{g \in G} gHg^{-1}$ does not exhaust $G$.




PROOF. In the union $\bigcup_{g \in G} gHg^{-1}$, the terms corresponding to $g$ and to $gh$, for
$h$ in $H$, are the same because $(gh)H(gh)^{-1} = g(hHh^{-1})g^{-1} = gHg^{-1}$. Thus
the union can be rewritten as $\bigcup_{gH \in G/H} gHg^{-1}$, it being understood that only one $g$ is
used from each coset $gH$. From this rewritten form of the union, we see that the
number of elements other than the identity in the union is

\[ \le [G:H](|H| - 1) = [G:H]\,|H| - [G:H] = |G| - [G:H] < |G| - 1, \]

and the lemma follows.
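

The count in the proof can be seen concretely in a small group. The Python sketch below takes $G$ to be the symmetric group on four letters and $H$ the stabilizer of one letter, forms the union of all conjugates of $H$, and compares its size with the bound from the proof. The choice of $G$ and $H$ and the helper names `compose` and `inverse` are illustrative assumptions made here.

```python
from itertools import permutations

def compose(p, q):
    """Permutation p after q, both stored as tuples of images of 0..n-1."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(4)))      # the symmetric group S_4, of order 24
H = [p for p in G if p[3] == 3]       # stabilizer of the point 3, of order 6

# Union of all conjugates g H g^{-1}.
union = {compose(compose(g, h), inverse(g)) for g in G for h in H}

index = len(G) // len(H)
bound = index * (len(H) - 1) + 1      # non-identity bound from the proof, plus the identity
print(len(union), bound, len(G))
```

Here the conjugates of $H$ are the four point stabilizers, so the union consists of the permutations with at least one fixed point. It has 15 elements, below the bound $[G:H](|H| - 1) + 1 = 21$ and well short of $|G| = 24$.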


PROOF OF THEOREM 2.48. Let $D$ be a finite division ring, and let $F$ be the
center. Then $F$ is a field, say of $q$ elements. Maximal subfields of $D$ certainly
exist. Any such subfield $K$ has $\dim_F D = (\dim_F K)^2$ by Corollary 2.46, and
hence any two such subfields $K$ and $K'$ have the same number of elements,
namely $q^{\dim_F K}$, and are therefore isomorphic as extension fields of $F$. The
Skolem–Noether Theorem (Theorem 2.41) shows that $K' = xKx^{-1}$ for some invertible $x$ in the
group $D^{\times}$ of invertible elements of $D$.

On the other hand, $F$ and any element of $D$ generate a subfield of $D$, and this
subfield is contained in a maximal subfield. Consequently any element of $D$ is
contained in some such $K'$, and $D = \bigcup_{x \in D^{\times}} xKx^{-1}$. Discarding the element 0
from both sides, we obtain $D^{\times} = \bigcup_{x \in D^{\times}} xK^{\times}x^{-1}$. Applying Lemma 2.49 to the
group $G = D^{\times}$ and the subgroup $H = K^{\times}$, we see that $K^{\times}$ cannot be a proper
subgroup of $D^{\times}$. Therefore $D = K$, and $D$ is commutative.


10. Frobenius’s Theorem about Division Algebras over the Reals 


We conclude this chapter by bringing together our results to prove the following 
celebrated theorem of Frobenius. 


Theorem 2.50 (Frobenius). Up to R isomorphism the only finite-dimensional 
associative division algebras over R are the algebras R of reals, C of complex 
numbers, and H of quaternions. 


REMARKS. The text of this chapter has not produced any concrete examples 
of noncommutative division rings other than the quaternions. Problems 12-16 at 
the end of the chapter produce generalized quaternion algebras in which R can 
be replaced by many other fields; there are infinitely many nonisomorphic such 
examples when the field is Q. In addition, Problems 17-19 produce examples
of central division algebras of dimension 9 over suitable base fields. The next 
chapter will give further insight into the construction of division algebras. 




PROOF. Let $D$ be such a division algebra, and let $F$ be the center. Then
$F$ is a finite extension field of $\mathbb{R}$ and must be $\mathbb{R}$ or $\mathbb{C}$, since $\mathbb{C}$ is algebraically
closed. If $F = \mathbb{C}$, then we have seen that $D = \mathbb{C}$. Thus we may assume that
$\operatorname{center}(D) = \mathbb{R}$.

Let $K$ be a maximal subfield of $D$ (existence by finite dimensionality), and let
$n = \dim_{\mathbb{R}} K$. Corollary 2.46 shows that $\dim_{\mathbb{R}} D = n^2$. Since $K$ has to be $\mathbb{R}$ or
$\mathbb{C}$, $n$ has to be 1 or 2. If $n = 1$, we obtain $D = \mathbb{R}$. Thus we may assume that
$n = 2$, $K = \mathbb{C}$, and $\dim_{\mathbb{R}} D = 4$.

The map $f : K \to D$ given by $f(a + bi) = a - bi$, where $i$ is the member
of $K$ corresponding to $\sqrt{-1}$ in $\mathbb{C}$, is an algebra homomorphism into a central
simple algebra over $\mathbb{R}$, and so is the map $g : K \to D$ given by $g(a + bi) = a + bi$.
By the Skolem–Noether Theorem (Theorem 2.41), there exists some $x$ in $D$ with
$x(a + bi)x^{-1} = a - bi$ for all $a$ and $b$ in $\mathbb{R}$.

This element $x$ has the property that $x^2$ commutes with every element of $K$
and must lie in $K$, by Corollary 2.47. Let us see that $x^2$ lies in $\operatorname{center}(D) = \mathbb{R}$.
In fact, otherwise 1 and $x^2$ would generate $K$ as an $\mathbb{R}$ algebra, and every member
of $D$ commuting with 1 and $x^2$ would commute with all of $K$; since $x$ commutes
with 1 and $x^2$, $x$ would have to commute with $K$, contradiction. Thus $x^2$ lies
in $\mathbb{R}$.

If $x^2 > 0$, then $x^2 = r^2$ for some $r \in \mathbb{R}$. The elements $x$ and $r$ together lie in
some subfield $K'$ of $D$, and $K'$ has no zero divisors. Since $(x - r)(x + r) = 0$
within $K'$, we conclude that $x = \pm r$. Then $x$ commutes with the maximal
subfield $K$ above, and we arrive at a contradiction.

Thus $x^2 < 0$. Write $x^2 = -y^2$ for some $y \in \mathbb{R}$, and put $j = y^{-1}x$. The
equation $x(a + bi)x^{-1} = a - bi$ says that $j(a + bi)j^{-1} = a - bi$ and in particular
that $jij^{-1} = -i$. Define $k = ij$.

We have $j^2 = y^{-2}x^2 = -1$. Hence $k^2 = ijij = i(jij^{-1})j^2 = i(-i)(-1) = i^2 = -1$.
Then $ijk = -1$, and $k = -1(j^{-1}i^{-1}) = -1(-j)(-i) = -ji$;
hence $ij + ji = 0$.

Let us show that $\{1, i, j, k\}$ is a linearly independent set over $\mathbb{R}$. Certainly $j$ is
not an $\mathbb{R}$ linear combination of 1 and $i$. If $k = a + bi + cj$ for some $a, b, c \in \mathbb{R}$,
then squaring gives

\[ -1 = k^2 = a^2 + b^2 i^2 + c^2 j^2 + 2abi + 2acj + bc(ij + ji)
      = a^2 - b^2 - c^2 + 2abi + 2acj. \]

Equating coefficients of 1, $i$, and $j$, we obtain $-1 = a^2 - b^2 - c^2$, $ab = 0$,
and $ac = 0$. We cannot have $-1 = a^2$, and thus at least one of $b$ and $c$ is
nonzero. Then $a = 0$, and $ij = k = bi + cj$. Left multiplication by $i$ gives
$-j = -b + c\,ij = -b + c(bi + cj)$; equating coefficients shows that $b = 0$.
Hence $ij = cj$, and we arrive at the contradiction $i = c \in \mathbb{R}$. We conclude that
$\{1, i, j, k\}$ is linearly independent over $\mathbb{R}$.




To complete the proof that $D$ is isomorphic to $\mathbb{H}$, we have only to verify that
$\{1, i, j, k\}$ satisfies the usual multiplication table for $\mathbb{H}$. We know that $i^2 = j^2 = k^2 = -1$,
that $k = ij$, and that $k = -ji$. The last of these says that $ji = -k$.
The other verifications are

\[ \begin{aligned}
jk &= jij = (jij^{-1})j^2 = (-i)(-1) = i, \\
kj &= ijj = i(-1) = -i, \\
ki &= iji = i(jij^{-1})j = i(-i)j = j, \\
ik &= iij = (-1)j = -j,
\end{aligned} \]

and the proof is complete.
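

The relations used in the last steps of the proof can be checked in a familiar model. The Python sketch below realizes $i$ and $j$ as 2-by-2 complex matrices satisfying the hypotheses $i^2 = j^2 = -1$ and $jij^{-1} = -i$, defines $k = ij$ as in the proof, and verifies the multiplication table. It also checks, for the sample element $x = 2j + 3k$, that $x$ conjugates $i$ to $-i$ and that $x^2$ is a negative real scalar. The matrix model, the sample $x$, and the helper names `mul`, `neg`, and `scale` are assumptions made here for illustration.

```python
def mul(a, b):
    """Product of two 2-by-2 complex matrices stored as nested lists."""
    return [[a[r][0] * b[0][c] + a[r][1] * b[1][c] for c in range(2)]
            for r in range(2)]

def neg(a):
    return [[-v for v in row] for row in a]

def scale(t, a):
    return [[t * v for v in row] for row in a]

ONE = [[1, 0], [0, 1]]
I = [[1j, 0], [0, -1j]]     # plays the role of i
J = [[0, 1], [-1, 0]]       # plays the role of j, with j^{-1} = -j
K = mul(I, J)               # k = ij, as in the proof

checks = {
    "i^2 = -1": mul(I, I) == neg(ONE),
    "j^2 = -1": mul(J, J) == neg(ONE),
    "k^2 = -1": mul(K, K) == neg(ONE),
    "j i j^-1 = -i": mul(mul(J, I), neg(J)) == neg(I),
    "ji = -k": mul(J, I) == neg(K),
    "jk = i": mul(J, K) == I,
    "kj = -i": mul(K, J) == neg(I),
    "ki = j": mul(K, I) == J,
    "ik = -j": mul(I, K) == neg(J),
}

# A sample x with x i x^{-1} = -i, i.e. x i = -i x; its square should be a negative real.
x = [[2 * J[r][c] + 3 * K[r][c] for c in range(2)] for r in range(2)]
checks["x i = -i x"] = mul(x, I) == neg(mul(I, x))
checks["x^2 = -13"] = mul(x, x) == scale(-13, ONE)

for name, ok in checks.items():
    print(name, ok)
```

All entries involved are small Gaussian integers, so the floating-point complex arithmetic is exact here and the equality tests are meaningful.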


11. Problems 


In all the problems below, all algebras are assumed to be associative. 

1. Let G be a finite group, and let CG be its complex group algebra. Prove that 
CG is a semisimple ring, and identify the constituent matrix algebras that arise 
for CG in Theorem 2.2 in terms of the irreducible representations of G. 

2. Wedderburn’s Main Theorem (Theorem 2.17) decomposes finite-dimensional
algebras $A$ in characteristic 0 as $A = S \oplus \operatorname{rad} A$ for some subalgebra $S$.
(a) What explicitly is a decomposition $A = S \oplus \operatorname{rad} A$ for the complex algebra
$\mathbb{C}[X]/(X^2 + 1)^2$?
(b) Is the subalgebra $S$ in (a) unique? Prove that it is, or give a counterexample.
(c) Answer the same questions as for (a) and (b) in the case of the real algebra
$\mathbb{R}[X]/(X^2 + 1)^2$.

3. Let $A$ and $B$ be finite-dimensional algebras with identity over a field $F$, and
suppose that $B$ is central simple. Prove that $\operatorname{rad}(A \otimes_F B) = (\operatorname{rad} A) \otimes_F B$.

Problems 4—7 concern commutative Artinian rings. Let R be such a ring. 

4. Prove that 
(a) R has only finitely many maximal ideals, 
(b) rad R is the set of all nilpotent elements in R, 
(c) R is semisimple if and only if it has no nonzero nilpotent elements, 
(d) R semisimple implies that R is the direct product of fields. 

5. Let $\bar e$ be an idempotent in $R/\operatorname{rad} R$. Prove that the idempotent $e \in R$ in
Proposition 2.23 with $\bar e = e + \operatorname{rad} R$ is unique.

6. Problem 4a shows that R has only finitely many maximal ideals. Let N be their 
product. Use Nakayama’s Lemma (Lemma 8.51 of Basic Algebra, restated in 
the present book on page xxv) to prove that N is a nilpotent ideal in R. 

7. Deduce from the previous problem that any prime ideal in R contains one of the 
finitely many maximal ideals, hence that every prime ideal in R is maximal. 




Problems 8-11 concern triangular rings, which were introduced in an example after
Proposition 2.5. The problems ask for verifications for some assertions that were
made in that example without proof. The notation is as follows: $R$ and $S$ are rings
with identity, and $M$ is a unital $(R, S)$ bimodule. Define a set $A$ and operations of
addition and multiplication symbolically by

\[ A = \begin{pmatrix} R & M \\ 0 & S \end{pmatrix} = \left\{ \begin{pmatrix} r & m \\ 0 & s \end{pmatrix} \right\}
   \quad \text{and} \quad
   \begin{pmatrix} r & m \\ 0 & s \end{pmatrix} \begin{pmatrix} r' & m' \\ 0 & s' \end{pmatrix}
   = \begin{pmatrix} rr' & rm' + ms' \\ 0 & ss' \end{pmatrix}. \]

8. Prove that the left ideals in $A$ are of the form $I_1 \oplus I_2$, where $I_2$ is a left ideal in
$S$ and $I_1$ is a left $R$ submodule of $R \oplus M$ containing $M I_2$. (Educational note:
Then similarly the right ideals in $A$ are of the form $J_1 \oplus J_2$, where $J_1$ is a right
ideal in $R$ and $J_2$ is a right $S$ submodule of $M \oplus S$ containing $J_1 M$.)


9. (a) Prove that the ring A is left Noetherian if and only if R and S are left 
Noetherian and M satisfies the ascending chain condition for its left R 
submodules. 

(b) Prove that the ring A is right Noetherian if and only if R and S are right 
Noetherian and M satisfies the ascending chain condition for its right S 
submodules. (Educational note: By similar arguments the conclusions 
of (a) and (b) remain valid if “Noetherian” is replaced by “Artinian” and 
“ascending” is replaced by “descending.”) 


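10. If $A = \begin{pmatrix} R & R \\ 0 & S \end{pmatrix}$ is any ring such as $\begin{pmatrix} \mathbb{Q} & \mathbb{Q} \\ 0 & \mathbb{Z} \end{pmatrix}$ in which $S$ is a (commutative) Noetherian
integral domain with field of fractions $R$ and if $S \ne R$, prove that $A$ is
left Noetherian and not right Noetherian, and $A$ is neither left nor right Artinian.


11. If $A = \begin{pmatrix} R & R \\ 0 & S \end{pmatrix}$ is a ring such as $\begin{pmatrix} \mathbb{R} & \mathbb{R} \\ 0 & \mathbb{Q} \end{pmatrix}$ in which $R$ and $S$ are fields with
$S \subseteq R$ and $\dim_S R$ is infinite, prove that $A$ is left Noetherian and left Artinian,
and $A$ is neither right Noetherian nor right Artinian.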


Problems 12-16 concern generalized quaternion algebras. Let $F$ be a field of
characteristic other than 2, let $K$ be a quadratic extension field, and let $\sigma$ be the
nontrivial element in the Galois group. The field $K$ is necessarily of the form
$K = F(\sqrt{m})$ for some nonsquare $m \in F$, and the elements $c$ of $K$ for which $\sigma(c) = -c$
are the $F$ multiples of $\sqrt{m}$. Fix an element $r \ne 0$ of $F$, and let $A$ be the subset of
$M_2(K)$ given by

\[ A = \left\{ \begin{pmatrix} a & b \\ r\sigma(b) & \sigma(a) \end{pmatrix} \;\middle|\; a, b \in K \right\}. \]

12. (a) Prove that $A$ is a 4-dimensional algebra over $F$.
(b) Prove that $A$ is central simple by examining $cx - xc$ for $c = \begin{pmatrix} \sqrt{m} & 0 \\ 0 & -\sqrt{m} \end{pmatrix}$
when $x \ne 0$ is in a two-sided ideal $I$ and is not in $K = \left\{ \begin{pmatrix} a & 0 \\ 0 & \sigma(a) \end{pmatrix} \right\}$.




13. Prove that $A$ is a division algebra if and only if $r$ is not of the form $N_{K/F}(c)$ for
some $c \in K$. Why must $A$ be isomorphic to $M_2(F)$ when $A$ is not a division
algebra?


14. Prove that if $r$ and $r'$ are two members of $F$ such that $r = r' N_{K/F}(c)$ for some $c$
in $K$, then the algebra $A$ associated to $r$ is isomorphic to the algebra associated
to $r'$.


15. Let $\{1, i, j, k\}$ be the $F$ basis of $A$ consisting of the matrices

\[ 1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
   i = \begin{pmatrix} \sqrt{m} & 0 \\ 0 & -\sqrt{m} \end{pmatrix}, \quad
   j = \begin{pmatrix} 0 & 1 \\ r & 0 \end{pmatrix}, \quad
   k = \begin{pmatrix} 0 & \sqrt{m} \\ -r\sqrt{m} & 0 \end{pmatrix}. \]

Prove that these satisfy $i^2 = m1$, $j^2 = r1$, $k^2 = -rm1$, $ij = k = -ji$,
$jk = -ri = -kj$, and $ki = -mj = -ik$.

16. By going over the proof of Theorem 2.50 and using the relations of the previous
problem, prove that every central simple algebra of dimension 4 over $F$ is of the
same kind as $A$ for some quadratic extension $K = F(\sqrt{m})$ and some member
$r \ne 0$ of $F$.


Problems 17-19 concern cyclic algebras, which were introduced by L. E. Dickson.
These extend the theory of generalized quaternion algebras to other sizes of matrices.
The analogy with the theory in Problems 12-16 is tightest when the size is a prime.
For notational simplicity this set of problems asks about size 3. Let $F$ be any field, and
let $K$ be a finite Galois extension of $F$ with cyclic Galois group. It is assumed in these
problems that $K$ has degree 3 over $F$ and that $\{1, \sigma, \sigma^2\}$ is the Galois group. Fix an
element $r \ne 0$ of $F$, and let $A$ be the subset of $M_3(K)$ given by

\[ A = \left\{ \begin{pmatrix} a & b & c \\ r\sigma(c) & \sigma(a) & \sigma(b) \\ r\sigma^2(b) & r\sigma^2(c) & \sigma^2(a) \end{pmatrix} \;\middle|\; a, b, c \in K \right\}. \]

Identifying $a \in K$ with the member $\begin{pmatrix} a & 0 & 0 \\ 0 & \sigma(a) & 0 \\ 0 & 0 & \sigma^2(a) \end{pmatrix}$ of $A$ and letting $j$ be the member
$\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ r & 0 & 0 \end{pmatrix}$ of $A$ allows one to view $A$ as the set of all matrices $a + bj + cj^2$ with
$a, b, c \in K$. The element $j$ satisfies $jaj^{-1} = \sigma(a)$ for $a \in K$ and $j^3 = r$.

17. Arguing as for Problem 12, show that A is an algebra over F and that it is central 
simple of dimension 9. 

18. Using the general theory, prove that $A$ either is a division algebra over $F$ or is
isomorphic to $M_3(F)$, and that $A \cong M_3(F)$ if and only if there is a 3-dimensional
vector subspace of $A$ that is a left $A$ submodule of $A$. (Educational note: This
problem makes crucial use of the fact that the size 3 is a prime.)

19. (a) Prove that if $r = N_{K/F}(d)$ for some $d \in K$, then the 3-dimensional vector
subspace $K(1 + d^{-1}j + d^{-1}\sigma(d)^{-1}j^2)$ of $A$ is a left $A$ submodule.

(b) Prove that any 3-dimensional left $K$ submodule of $A$ is necessarily of the
form $K(a_0 + b_0 j + c_0 j^2)$ for some nonzero $a_0 + b_0 j + c_0 j^2$ in $A$ and that
this left $K$ submodule is a left $A$ submodule only if there exists an element
$d \in K$ with $N_{K/F}(d) = r$, $d a_0 = r\sigma(c_0)$, $d b_0 = \sigma(a_0)$, and $d c_0 = \sigma(b_0)$.


CHAPTER III


Brauer Group 


Abstract. This chapter continues the study of finite-dimensional associative division algebras over 
a field F, with particular attention to those that are simple and have center F. Section 5 is a self- 
contained digression on cohomology of groups that is preparation for an application in Section 6 
and for a general treatment of homological algebra in Chapter IV. 


Section 1 introduces the Brauer group of $F$ and the relative Brauer group of $K/F$, $K$ being
any finite extension field. The Brauer group $B(F)$ is the abelian group of equivalence classes of
finite-dimensional central simple algebras over $F$ under a relation called Brauer equivalence. The
inclusion $F \subseteq K$ induces a group homomorphism $B(F) \to B(K)$, and the relative Brauer group
$B(K/F)$ is the kernel of this homomorphism. The members of the kernel are those classes such
that the tensor product with $K$ of any member of the class is isomorphic to some full matrix algebra
$M_n(K)$; such a class always has a representative $A$ with $\dim_F A = (\dim_F K)^2$. One proves that
$B(F)$ is the union of all $B(K/F)$ as $K$ ranges over all finite Galois extensions of $F$.

Sections 2-3 establish a group isomorphism $B(K/F) \cong H^2(\operatorname{Gal}(K/F), K^{\times})$ when $K$ is a finite
Galois extension of $F$. With these hypotheses on $K$ and $F$, Section 2 introduces data called a
factor set for each member of $B(K/F)$. The data depend on some choices, and the effect of making
different choices is to multiply the factor set by a “trivial factor set.” Passage to factor sets thereby
yields a function from $B(K/F)$ to the cohomology group $H^2(\operatorname{Gal}(K/F), K^{\times})$. Section 3 shows
how to construct a concrete central simple algebra over $F$ from a factor set, and this construction
is used to show that the function from $B(K/F)$ to $H^2(\operatorname{Gal}(K/F), K^{\times})$ constructed in Section 2 is
one-one onto. An additional argument shows that this function in fact is a group isomorphism.

Section 4 proves under the same hypotheses that $H^1(\operatorname{Gal}(K/F), K^{\times}) = 0$, and a corollary
makes this result concrete when the Galois group is cyclic. This result and the corollary are known
as Hilbert’s Theorem 90.

Section 5 is a self-contained digression on the cohomology of groups. If $G$ is a group and $\mathbb{Z}G$ is
its integral group ring, a standard resolution of $\mathbb{Z}$ by free $\mathbb{Z}G$ modules is constructed in the category
of all unital left $\mathbb{Z}G$ modules. This has the property that if $M$ is an abelian group on which $G$ acts
by automorphisms, then the groups $H^n(G, M)$ result from applying the functor $\operatorname{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to
the members of this resolution, dropping the term $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}, M)$, and taking the cohomology of
the resulting complex. Section 5 goes on to show that the groups $H^n(G, M)$ arise whenever this
construction is applied to any free resolution of $\mathbb{Z}$, not necessarily the standard one. This section
serves as a prerequisite for Section 6 and as motivational background for Chapter IV.

Section 6 applies the result of Section 5 in the case that $G$ is finite cyclic, producing a nonstandard
free resolution of $\mathbb{Z}$ in this case. From this alternative free resolution, one obtains a rather explicit
formula for $H^2(G, M)$ whenever $G$ is finite cyclic. Application to the case that $G$ is the Galois group
$\operatorname{Gal}(K/F)$ for a finite Galois extension gives the explicit formula $B(K/F) \cong F^{\times}/N_{K/F}(K^{\times})$ for
the relative Brauer group when the Galois group is cyclic.






1. Definition and Examples, Relative Brauer Group 


The “Brauer group” of a field allows one to work with the set of all isomorphism 
classes of finite-dimensional central division algebras over the field. The core 
theory in principle reduces the study of all such division algebras to questions in 
the cohomology theory of groups. The latter theory was introduced in Chapter VII 
of Basic Algebra and will be developed further in the present chapter and the next. 

Let $F$ be a field. Theorem 2.4 shows that every finite-dimensional central
simple algebra $A$ over $F$ is of the form $A \cong M_n(D)$ for some uniquely determined
integer $n \ge 1$ and some finite-dimensional central division algebra $D$ over $F$ that
is uniquely determined up to $F$ isomorphism. We can introduce an equivalence
relation for finite-dimensional central simple algebras over $F$ that exactly mirrors
the relation of $F$ isomorphism of the underlying finite-dimensional central
division algebras. Specifically if $A \cong M_n(D)$ and $A' \cong M_{n'}(D')$ are two such
central simple algebras for the same $F$ such that $D \cong D'$, then we say that $A$
is Brauer equivalent to $A'$, and we write $A \sim A'$. It is immediate from the
definition that “Brauer equivalent” is an equivalence relation. We shall introduce
an abelian-group structure into the set of Brauer equivalence classes, hence into
the set of isomorphism classes of central finite-dimensional division algebras
over $F$.


Proposition 10.24 of Basic Algebra gives the definition of the tensor product
of two F algebras¹ over F, and this operation is associative, up to canonical
isomorphism, by Proposition 10.22. It is also commutative, up to canonical
isomorphism. In fact, if A and B are given algebras over F, then the canonical
vector-space isomorphism $\varphi : A \otimes_F B \to B \otimes_F A$ is given by $\varphi(a \otimes b) = b \otimes a$.
If $a_1 \otimes b_1$ and $a_2 \otimes b_2$ are given, then the computation

\[
\begin{aligned}
\varphi(a_1 \otimes b_1)\,\varphi(a_2 \otimes b_2) &= (b_1 \otimes a_1)(b_2 \otimes a_2) = b_1 b_2 \otimes a_1 a_2 \\
&= \varphi(a_1 a_2 \otimes b_1 b_2) = \varphi\big((a_1 \otimes b_1)(a_2 \otimes b_2)\big)
\end{aligned}
\]

shows that $\varphi$ respects multiplication. Hence tensor product is commutative for
algebras, up to canonical isomorphism.

¹ All algebras in this chapter are understood to be associative.


Lemma 3.1. If F is a field, then

(a) $M_n(R) \cong R \otimes_F M_n(F)$ for any algebra R with identity over F,
(b) $M_m(F) \otimes_F M_n(F) \cong M_{mn}(F)$.


PROOF. For (a), the F bilinear map $(r, [a_{ij}]) \mapsto [r a_{ij}]$ of $R \times M_n(F)$ into
$M_n(R)$ has a unique linear extension $\varphi$ to an F linear map of $R \otimes_F M_n(F)$ into
$M_n(R)$. The map $\varphi$ has

\[
\begin{aligned}
\varphi\big((r \otimes [a_{ij}])(r' \otimes [a'_{kl}])\big) &= \varphi\big(rr' \otimes [a_{ij}][a'_{kl}]\big) \\
&= rr'[a_{ij}][a'_{kl}] \\
&= r[a_{ij}]\,r'[a'_{kl}] \qquad \text{since each } a_{ij} \text{ is in } F \\
&= \varphi(r \otimes [a_{ij}])\,\varphi(r' \otimes [a'_{kl}]),
\end{aligned}
\]

and hence $\varphi$ is an F algebra homomorphism. If $\{r_k\}$ is a vector-space basis of R
over F and if $\{E_{ij}\}$ is the usual basis of $M_n(F)$, then $\varphi(r_k \otimes E_{ij}) = r_k E_{ij}$, and it
follows that $\varphi$ carries a vector-space basis onto a vector-space basis. Hence $\varphi$ is
one-one and onto.

For (b), the result of (a) gives $M_m(F) \otimes_F M_n(F) \cong M_n(M_m(F))$, and the
algebra on the right is isomorphic to the algebra $M_{mn}(F)$ of matrices of size mn
by the multiplication-in-blocks isomorphism.
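
The isomorphism in Lemma 3.1b can be made concrete with the Kronecker (block) product of matrices. The following is a minimal sketch, assuming F is the field of rational numbers with m = 2 and n = 3; the function names `matmul`, `kron`, and `unit` and the sample matrices are illustrative choices and do not come from the text. It checks that the Kronecker product is multiplicative on pure tensors and that it carries the basis of products of matrix units onto the full set of matrix units of size mn.

```python
# Sketch: the Kronecker (block) product as one concrete model of the map
# M_m(F) (x)_F M_n(F) -> M_mn(F). All names here are illustrative.
from fractions import Fraction
from itertools import product

def matmul(A, B):
    """Ordinary product of two square matrices given as lists of rows."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def kron(A, B):
    """Kronecker product: the ((i,k),(j,l)) entry is A[i][j] * B[k][l]."""
    m, n = len(A), len(B)
    return [[A[i][j] * B[k][l] for j in range(m) for l in range(n)]
            for i in range(m) for k in range(n)]

def unit(size, i, j):
    """Matrix unit E_ij: 1 in position (i, j) and 0 elsewhere."""
    return [[Fraction(int((r, c) == (i, j))) for c in range(size)]
            for r in range(size)]

# Arbitrary sample matrices over Q with m = 2 and n = 3.
A1 = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
A2 = [[Fraction(0), Fraction(1)], [Fraction(5), Fraction(-1)]]
B1 = [[Fraction(2), Fraction(0), Fraction(1)],
      [Fraction(0), Fraction(1), Fraction(0)],
      [Fraction(1), Fraction(1), Fraction(3)]]
B2 = [[Fraction(1), Fraction(1), Fraction(0)],
      [Fraction(0), Fraction(2), Fraction(1)],
      [Fraction(4), Fraction(0), Fraction(1)]]

# Multiplicativity on pure tensors: (A1 (x) B1)(A2 (x) B2) = A1 A2 (x) B1 B2.
multiplicative = (matmul(kron(A1, B1), kron(A2, B2))
                  == kron(matmul(A1, A2), matmul(B1, B2)))

# The basis {E_ij (x) E_kl} goes onto the set of all 36 matrix units of size 6.
images = {tuple(map(tuple, kron(unit(2, i, j), unit(3, k, l))))
          for i, j in product(range(2), repeat=2)
          for k, l in product(range(3), repeat=2)}
all_units = {tuple(map(tuple, unit(6, r, c)))
             for r, c in product(range(6), repeat=2)}
basis_onto_basis = images == all_units

print(multiplicative, basis_onto_basis)
```

Exact rational arithmetic is used so that the equality tests are not affected by rounding.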


Proposition 3.2. For the field F, the operation of tensor product on finite-dimensional
central simple algebras over F descends to an operation on the set
of Brauer equivalence classes of such algebras and makes this set into an abelian
group.

PROOF. The tensor product of two finite-dimensional algebras over F is again 
a finite-dimensional algebra, and Proposition 2.36 shows that the tensor product 
of two central simple algebras is again central simple. Hence tensor product is 
well defined as an operation on finite-dimensional central simple algebras over 
F. Let us see that tensor product is a Brauer class property. Thus suppose that 
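$A \sim A'$ and $B \sim B'$, say with $A \cong M_m(D)$, $A' \cong M_{m'}(D)$, $B \cong M_n(E)$, and
$B' \cong M_{n'}(E)$. Since the tensor product of some $M_n(F)$ with an algebra over F,
up to isomorphism, does not depend on the order of the two factors and since
tensor product is associative up to isomorphism, Lemma 3.1 gives

\[
\begin{aligned}
A \otimes_F B &\cong M_m(D) \otimes_F M_n(E) \cong D \otimes_F M_m(F) \otimes_F M_n(F) \otimes_F E \\
&\cong D \otimes_F M_{mn}(F) \otimes_F E \cong M_{mn}(F) \otimes_F D \otimes_F E \\
&\cong M_{mn}(D \otimes_F E).
\end{aligned}
\]

Similarly $A' \otimes_F B' \cong M_{m'n'}(D \otimes_F E)$. Thus $A \otimes_F B \sim A' \otimes_F B'$.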

We have observed that the tensor product operation on algebras over F is 
associative and commutative, up to canonical isomorphisms, and hence so is the 
product operation on Brauer equivalence classes. The class of the 1-dimensional 
algebra F is the identity, and the class of the opposite algebra $A^o$ is an inverse to
the class of A because of the isomorphism $A \otimes_F A^o \cong M_n(F)$ given in Corollary
2.38.
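
The inverse property can be illustrated for the quaternions. The sketch below is only an illustration under stated assumptions: it uses rational coefficients so that the arithmetic is exact, it encodes a quaternion t + xi + yj + zk as the tuple (t, x, y, z), and it takes the map sending $a \otimes b$ to the operator $x \mapsto axb$ as the concrete form of the isomorphism $A \otimes_F A^o \cong M_n(F)$. The helper names `qmul`, `operator`, and `rank` are hypothetical. The code checks that the 16 operators coming from pairs of basis quaternions are linearly independent, so that they span the 16-dimensional algebra of all linear maps of the 4-dimensional space of quaternions.

```python
# Sketch: for the rational quaternions H, the operators x -> a x b (a, b basis
# quaternions) span End(H), in line with H (x) H^o being a 4-by-4 matrix algebra.
from fractions import Fraction

def qmul(p, q):
    """Product of quaternions stored as (t, x, y, z) = t + x i + y j + z k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

# The basis 1, i, j, k.
BASIS = [tuple(Fraction(int(r == s)) for s in range(4)) for r in range(4)]

def operator(a, b):
    """Flattened 4x4 matrix of the linear map x -> a x b on the quaternions."""
    cols = [qmul(qmul(a, e), b) for e in BASIS]
    return [cols[c][r] for r in range(4) for c in range(4)]

def rank(rows):
    """Rank of a list of rational vectors, by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

ops = [operator(a, b) for a in BASIS for b in BASIS]
span_dim = rank(ops)
print(span_dim)
```

A span of full dimension 16 is what one expects if the class of the quaternions is its own inverse up to the identification of the quaternions with their opposite algebra.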




The abelian group of Brauer equivalence classes of finite-dimensional central 
simple algebras over F is called the Brauer group of F and is denoted by B(F). 
We use additive notation for its product operation. 


EXAMPLES ALREADY SETTLED IN CHAPTER II.
(1) If F is algebraically closed, then B(F) = 0.
(2) If $F = \mathbb{R}$, then $B(F) \cong \mathbb{Z}/2\mathbb{Z}$ by Frobenius’s Theorem (Theorem 2.50).
(3) If F is a finite field, then B(F) = 0 by Wedderburn’s Theorem about finite
division rings (Theorem 2.48).


The group structure for B(F) given in Proposition 3.2 offers little help by
itself in identifying the finite-dimensional division algebras over a particular field.
Instead, the usual procedure for understanding B(F) is to isolate certain special
subgroups of B(F), known as “relative Brauer groups” and denoted by B(K/F),
K being any finite extension of F. Under the assumption that K is a finite Galois
extension of F, Theorem 3.14 below says that B(K/F) is isomorphic to the
cohomology group $H^2(G, N)$, where G is the finite group $G = \operatorname{Gal}(K/F)$ and
N is the (abelian) multiplicative group $K^\times$ of the field K. This cohomology group
is in principle manageable. Corollary 3.9 below says that B(F) is the union over
all finite Galois extensions K/F of B(K/F), and we therefore obtain a handle
on B(F).

If A is any finite-dimensional central simple algebra over F and if K/F is any
field extension, then Proposition 2.36a shows that $A \otimes_F K$ is simple as a ring, and
Lemma 2.35b shows that $A \otimes_F K$ has center K. Therefore $A \otimes_F K$ is a central
simple algebra over K, and its Brauer equivalence class is a member of B(K).

Let us see that this map of algebras A into B(K) depends only on the Brauer
equivalence class of A in B(F). Thus suppose that $A \cong M_n(D)$ and $A' \cong M_{n'}(D)$
for some finite-dimensional central division algebra D over F. Lemma 3.1a gives
us isomorphisms of F algebras

\[
\begin{aligned}
A \otimes_F K &\cong M_n(D) \otimes_F K \cong (M_n(F) \otimes_F D) \otimes_F K \\
&\cong M_n(F) \otimes_F (D \otimes_F K) \cong M_n(D \otimes_F K),
\end{aligned}
\]

and similarly $A' \otimes_F K \cong M_{n'}(D \otimes_F K)$. In each case the left member of
the isomorphism is a K algebra, with K contained in the center. Thus we can
view each of our isomorphisms as isomorphisms of central simple K algebras.
Since $D \otimes_F K$ is a finite-dimensional central simple K algebra, we know that
$D \otimes_F K \cong M_r(E)$ for some finite-dimensional central division algebra E over
K. Application of Lemma 3.1b allows us to continue the displayed isomorphisms
as

\[
A \otimes_F K \cong M_n(D \otimes_F K) \cong M_n(M_r(E)) \cong M_{nr}(E).
\]

Similarly we have $A' \otimes_F K \cong M_{n'r}(E)$. Thus $A \otimes_F K$ and $A' \otimes_F K$ yield the
same member of B(K), and $(\,\cdot\,) \otimes_F K$ induces a well-defined function from B(F)
into B(K).

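The function from B(F) into B(K) is a group homomorphism. In fact, if A and
B are finite-dimensional central simple algebras over F, then we have K isomorphisms

\[
\begin{aligned}
(A \otimes_F K) \otimes_K (B \otimes_F K) &\cong A \otimes_F (K \otimes_K (B \otimes_F K)) \\
&\cong A \otimes_F (B \otimes_F K) \cong (A \otimes_F B) \otimes_F K,
\end{aligned}
\]

and the map is indeed a group homomorphism.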

In addition, the resulting homomorphism satisfies the expected compatibility
condition with respect to compositions. In more detail, if we have nested fields
$F \subseteq K \subseteq L$, then the L isomorphisms

\[
(A \otimes_F K) \otimes_K L \cong A \otimes_F (K \otimes_K L) \cong A \otimes_F L
\]

show that the composition of tensoring with K over F, followed by tensoring
with L over K, yields the same result as tensoring directly with L over F.

We define the relative Brauer group B(K/F) to be the kernel of the homomorphism
of B(F) into B(K). The members of the group B(K/F) are the Brauer
equivalence classes of finite-dimensional central simple F algebras A such that
$A \otimes_F K$ is F isomorphic to $M_n(K)$ for some n. We say that such algebras are
split over K, that K splits such algebras, and that K is a splitting field for these
algebras and their Brauer equivalence classes.


Theorem 3.3. Let K/F be a finite extension of fields. Then K is a splitting
field for a given member X of B(K/F) if and only if there exists an algebra A
over F in the Brauer equivalence class X containing a subfield $K'$ isomorphic to
K such that $\dim_F A = (\dim_F K')^2$.


REMARKS. 

(1) The theory of the Brauer group makes repeated use of this result. Corollary 
2.47 shows that the subfield K’ of A is a maximal commutative subalgebra of A 
and in particular is a maximal subfield of A. 

(2) Observe that the field K is given in the theorem, and hence the integer $n =
\dim_F K$ is known. Then A must have dimension $n^2$. The equality $\dim_F A = n^2$
determines A up to F isomorphism. In fact, Theorem 2.4 shows that $A \cong M_r(D)$
for a central division algebra D whose isomorphism class is determined by the class
X. Then $n^2 = \dim_F A = r^2 \dim_F D$, and $r^2 = n^2/\dim_F(D)$. So A is indeed
determined up to F isomorphism.

(3) In view of the previous remark, any class X in B(K/F) has a distinguished
representative that is unique up to F isomorphism; the distinguished representatives
of the members of B(K/F) for fixed K all have the same dimension.



PROOF. Suppose that A is a central simple algebra in the Brauer equivalence
class X containing a subfield $K'$ isomorphic to K such that $\dim_F A = (\dim_F K')^2$.
We are to prove that $K'$ splits A. Write n for $\dim_F K'$, so that $\dim_F A =
n^2$. Regard A as an n-dimensional $K'$ vector space with $K'$ acting by right
multiplication on A. Define an F bilinear mapping $f : A \times K' \to \operatorname{End}_{K'}(A)$ by
$f(a, c)(a') = aa'c$; the image $f(a, c)$ is in $\operatorname{End}_{K'}(A)$ because

\[
f(a, c)(a'c') = aa'c'c = (aa'c)c' = (f(a, c)(a'))c'.
\]

Extend f without changing its name to an F linear mapping $f : A \otimes_F K' \to
\operatorname{End}_{K'}(A)$ such that $f(a \otimes c)(a') = aa'c$. The mapping f is actually $K'$ linear
because

\[
f((a \otimes c)c')(a') = f(a \otimes cc')(a') = aa'cc' = (f(a \otimes c)(a'))c'.
\]

Also, it respects multiplication, since

\[
\begin{aligned}
f(a \otimes c)\big(f(a' \otimes c')(a'')\big) &= f(a \otimes c)(a'a''c') = aa'a''c'c = aa'a''cc' \\
&= f(aa' \otimes cc')(a'') = f\big((a \otimes c)(a' \otimes c')\big)(a'').
\end{aligned}
\]

Thus f is a homomorphism of $K'$ algebras. The domain $A \otimes_F K'$ is central
simple over $K'$, as we saw when setting up the homomorphism $B(F) \to B(K)$,
and therefore f is one-one. Since $A \otimes_F K'$ and $\operatorname{End}_{K'}(A)$ both have $K'$ dimension
$n^2$, f has to be onto. Thus f exhibits $A \otimes_F K'$ as isomorphic to a full matrix
ring over $K'$, and $K'$ splits A.

Conversely suppose that K is a splitting field for the members of the class X
in B(F). Let D be a division algebra in the class X. Since B(K/F) is a group
and therefore contains the inverse class of $D^o$, we must have $D^o \otimes_F K \cong M_m(K)$
for the integer m such that $\dim_F D^o = m^2$. Let us rewrite this K isomorphism as
$D^o \otimes_F K \cong \operatorname{End}_K(K^m)$. The algebra $\operatorname{End}_F(K^m)$ is central simple over F, and
up to an isomorphism, it contains the K algebra $D^o \otimes_F K$ and hence also the F
algebra $D^o \otimes_F F \cong D^o$. Let A be the centralizer of $D^o$ in $\operatorname{End}_F(K^m)$. We shall
prove that A has the required properties.

The algebra A contains $(\text{center } D^o) \otimes_F K$, which is a subfield $K'$ isomorphic
to K because $D^o$ is central over F, and A is simple by the Double Centralizer
Theorem (Theorem 2.43). The center of A matches the center of the centralizer
of A, which is the center of $D^o$ by Theorem 2.43, which in turn is F. Thus A is
central simple over F. Yet another application of Theorem 2.43 gives

\[
(\dim_F A)(\dim_F D^o) = \dim_F \operatorname{End}_F(K^m) = m^2 (\dim_F K)^2. \tag{$*$}
\]

Since $\dim_F D^o = m^2$, we see that $\dim_F A = (\dim_F K)^2$. Thus the subfield $K'$
of A isomorphic to K has the required dimension.

To see that A is in the Brauer equivalence class X, start from the F bilinear
map $A \times (D^o \otimes_F F) \to \operatorname{End}_F(K^m)$ given by $(a, d \otimes 1) \mapsto ad$, and form its
F linear extension $\varphi : A \otimes_F (D^o \otimes_F F) \to \operatorname{End}_F(K^m)$. The map $\varphi$ respects
multiplication because the members of A commute with the members of $D^o \otimes_F F$:

\[
\begin{aligned}
\varphi(a \otimes (d \otimes 1))\big(\varphi(a' \otimes (d' \otimes 1))(v)\big) &= \varphi(a \otimes (d \otimes 1))(a'd'v) = ada'd'v \\
&= aa'dd'v = \varphi(aa' \otimes (dd' \otimes 1))(v).
\end{aligned}
\]

Since $A \otimes_F (D^o \otimes_F F)$ is simple by Proposition 2.36, $\varphi$ is one-one. A look at
$(*)$ shows that

\[
\dim_F\big(A \otimes_F (D^o \otimes_F F)\big) = (\dim_F A)(\dim_F D^o) = \dim_F \operatorname{End}_F(K^m)
\]

and allows us to conclude that $\varphi$ is onto. Therefore $A \otimes_F D^o \cong \operatorname{End}_F(K^m)$.
Since $\operatorname{End}_F(K^m)$ is Brauer equivalent to F, the Brauer equivalence class of A is
the inverse of the class of $D^o$. Hence the class of A equals the class of D, which
is X.


Corollary 3.4. If D is a finite-dimensional central division algebra over the
field F, then any maximal subfield K of D splits D.


PROOF. This is the special case of Theorem 3.3 in which A = D. The formula 
for the dimensions holds by Corollary 2.47. 


Corollary 3.5. If F is a field, then the Brauer group B(F) is the union of all
relative Brauer groups B(K/F) as K ranges over all finite extensions of F.


REMARKS. This result is all very tidy but is not very useful, since we have no
indication how to identify B(K/F) for a general finite extension K of F. In Corollary
3.9 below, we sharpen this result to make K range only over the finite Galois
extensions of F, and we shall see in Section 3 that B(K/F) can be realized for
such fields K in terms of the cohomology of groups.


PROOF. Any member of B(F) has some central division algebra D as a
representative, and Corollary 3.4 identifies an extension field K of F that splits
D, namely any maximal subfield of D.


Corollary 3.6. Let D be a finite-dimensional central division algebra over a
field F, and let $\dim_F D = n^2$. If K is a splitting field for D, then $\dim_F K$ is a
multiple of n.


PROOF. If K is a splitting field for D, then Theorem 3.3 says that there
exists an integer r such that $M_r(D)$ contains a subfield $K'$ isomorphic to K with
$\dim_F M_r(D) = (\dim_F K')^2$. Thus $r^2 n^2 = (\dim_F K)^2$, and $rn = \dim_F K$.



Theorem 3.7 (Noether-Jacobson Theorem). If D is a noncommutative finite-dimensional
central division algebra over the field F, then there exists a member
of D that is not in F and is separable over F.


REMARKS. Within a field extension K/F, we know from Corollary 9.31 of 
Basic Algebra that the subset of all elements of K that are separable over F is 
a subfield of K containing F. Consequently an equivalent formulation of the 
theorem is that D contains a nontrivial separable extension field of F. 


PROOF (Herstein). Arguing by contradiction, suppose that no element of D
outside F is separable over F. Let the characteristic of F be p, necessarily
nonzero. If a is any element of D not in F, then the assumed nonseparability
implies that the minimal polynomial $f(X)$ of a over F has $f'(X) = 0$, according
to Proposition 9.27 of Basic Algebra. Hence $f(X) = f_1(X^p)$ for some polynomial
$f_1(X)$ in $F[X]$. In turn, the minimal polynomial of $a^p$ is $f_1(X)$, and if $a^p$ is
not in F, then $f_1(X) = f_2(X^p)$ for some polynomial $f_2(X)$ in $F[X]$. Since the
degree decreases at each step as we pass from $f$ to $f_1$, from $f_1$ to $f_2$, and so on,
we conclude that $a^{p^e}$ is in F for some e. In short, each a in D has the property
that there is some integer $e \geq 0$ depending on a such that $a^{p^e}$ is in F.

In view of the assumption that $D \neq F$ and the argument that we have just
seen, there exists an element a in D outside F such that $a^p$ is in F. Define a
function $d : D \to D$ by $d(x) = xa - ax$. The function d is F linear, and it is
not identically 0 because a is not in the center F of D. If r and l denote right
and left multiplication, we can rewrite d as $d(x) = (r(a) - l(a))(x)$. The linear
maps $r(a)$ and $l(a)$ commute with each other, and thus the Binomial Theorem is
applicable in computing $d^p(x)$ as

\[
d^p(x) = (r(a) - l(a))^p(x) = (r(a)^p - l(a)^p)(x) = xa^p - a^p x = 0,
\]

the last equality holding because $a^p$ is in F and is therefore central. Since $d^p$ is the
zero function and d is not, there exist an integer s with $2 \leq s \leq p$ and an element
y in D with $d^{s-1}y \neq 0$ and $d^s y = 0$. Put $x = d^{s-1}y$. Since $x = d(d^{s-2}y)$, the
element $w = d^{s-2}y$ has the property that $x = wa - aw$. The condition $dx = 0$
says that $xa = ax$. Put $x = au$. The elements a and u commute because a and
x commute. If we set $c = wu^{-1}$, then $x = wa - aw = cua - acu$, and hence
$a = xu^{-1} = cuau^{-1} - ac$. Since a and u commute, we obtain $a = ca - ac$. Right
multiplying by $a^{-1}$ gives $1 = c - aca^{-1}$ and therefore $c = 1 + aca^{-1}$. Raising
both sides to the $p^e$ power gives $c^{p^e} = 1 + ac^{p^e}a^{-1}$. The first paragraph of the
proof shows that there is some $e' \geq 0$ for which $c^{p^{e'}}$ is in F, and for this integer
$e'$, we obtain the contradictory equation $c^{p^{e'}} = 1 + c^{p^{e'}}$ from the commutativity
of a with F. This completes the proof.



Corollary 3.8. If D is a noncommutative finite-dimensional central division
algebra over the field F and if K is a subfield of D that is separable over F, then
there exists a maximal subfield L of D containing K such that L is separable
over F.


PROOF. Because of the finite dimensionality, we may assume without loss
of generality that K is not properly contained in any larger subfield of D that
is separable over F. Arguing by contradiction, we may assume that K is not a
maximal subfield of D. Let E be the centralizer of K in D. This is a division
algebra over F. It is simple by the Double Centralizer Theorem (Theorem 2.43),
and it contains K because K is commutative. Moreover, we know from Theorem
2.43 that

\[
\dim_F D = (\dim_F K)(\dim_F E)
\]

and that K is the centralizer of E. The latter condition shows that the division
algebra E is central simple over K. Since K is not a maximal subfield of D,
Corollary 2.46 gives $\dim_F D > (\dim_F K)^2$. Thus $\dim_F K < \dim_F E$. Since E
is central over K, E is noncommutative.

Application of Theorem 3.7 produces an element x in E outside K that is 
separable over K. Let L be the subfield K(x) of E. Since K is a separable 
extension of F, the Theorem of the Primitive Element gives an element a of K 
such that K = F(a). Then L = F(a,x). The implication (b) implies (c) in 
Corollary 9.29 of Basic Algebra shows that if a is separable over F and x is 
separable over F(a), then a and x are both separable over F. The elements of L 
that are separable over F form a subfield of L, and we have just proved that this 
subfield properly contains K. This conclusion contradicts the assumption that K 
is a maximal separable extension of F within D, and the proof is complete. 


Corollary 3.9. If F is a field, then the Brauer group B(F) is the union of
all relative Brauer groups B(K/F) as K ranges over all finite Galois extensions
of F.


REMARKS. This is the result of interest. Each B(K/F) with K as in the
corollary will be seen to be given as an $H^2$ in the cohomology of groups, and this
group is in principle manageable. Thus we obtain a handle on B(F).


PROOF. If D is a central division algebra over F, then Corollaries 3.4 and 3.8
together show that some finite separable extension $K'$ of F splits D. That is, the
Brauer equivalence class of D lies in $B(K'/F)$. Let us write $K' = F(a)$ by the
Theorem of the Primitive Element. If $f(X)$ is the minimal polynomial of a over F,
then every root of $f(X)$ in an algebraic closure $\overline{F}$ of F containing $K'$ is separable
over F. Let K be the subfield of $\overline{F}$ generated by all the roots. This is a finite
normal extension, and Corollary 9.30 of Basic Algebra shows that it is a separable
extension. We have seen that the composition of the homomorphisms $B(F) \to
B(K')$ and $B(K') \to B(K)$ is $B(F) \to B(K)$, and consequently $B(K'/F) \subseteq
B(K/F)$. Therefore the Brauer equivalence class of D lies in B(K/F).


2. Factor Sets 


Throughout this section let K/F be a finite Galois extension of fields. Our
objective is to construct a function from the relative Brauer group B(K/F) into
the cohomology group $H^2(\operatorname{Gal}(K/F), K^\times)$. In Section 3 we shall prove that this
function is a group isomorphism.

We take as known the material in Chapter VII of Basic Algebra on cohomology 
of groups. For convenient reference we list the relevant formulas for cohomology 
in degree 2. If G is a group and N is an abelian group on which G acts by
automorphisms, the group $C^2(G, N)$ of 2-cochains is the group of all functions
$a : G \times G \to N$, the group $Z^2(G, N)$ of 2-cocycles is the set of members f of
$C^2(G, N)$ such that

\[
u(f(v, w)) + f(u, vw) = f(uv, w) + f(u, v) \qquad \text{for all } u, v, w \in G,
\]

the group $B^2(G, N)$ of 2-coboundaries is the set of members f of $C^2(G, N)$ of
the form

\[
f(u, v) = u(\alpha(v)) - \alpha(uv) + \alpha(u) \qquad \text{for some } \alpha : G \to N,
\]

and the cohomology group $H^2(G, N)$ is the quotient

\[
H^2(G, N) = Z^2(G, N)/B^2(G, N).
\]

Here it is understood that we are using additive notation for the group operation
in N and that the action of $u \in G$ on a member n of N is denoted by $u(n)$.

In constructing the function from B(K/F) into $H^2(\operatorname{Gal}(K/F), K^\times)$, we shall
proceed in somewhat the same fashion as for the identification of group extensions
with an $H^2$ that was carried out in Chapter VII of Basic Algebra. Namely we shall
associate a “factor set” to some choices concerning a given finite-dimensional
central simple algebra and see that this factor set is a cocycle. Then we shall show
that the factor set for any set of choices for any Brauer-equivalent central simple
algebra differs from this cocycle by a coboundary. The result will be the desired
function from B(K/F) into $H^2(\operatorname{Gal}(K/F), K^\times)$.

Thus write G for $\operatorname{Gal}(K/F)$, fix a Brauer equivalence class X in B(K/F), and
let A be a central simple algebra in the class X meeting the conditions of Theorem
3.3: A contains a subfield $K'$ isomorphic to K, and $\dim_F A = (\dim_F K')^2$. Write
$c \mapsto c'$ for the isomorphism $K \to K'$.



Let $\sigma$ be an element of the Galois group G. Then $c \mapsto c'$ and $c \mapsto \sigma(c)'$
are two algebra homomorphisms of the simple algebra K into the central simple
algebra A, and the Skolem-Noether Theorem (Theorem 2.41) says that they are
related by an inner automorphism:

\[
\boxed{\sigma(c)' = x_\sigma c' x_\sigma^{-1}} \qquad \text{for some } x_\sigma \in A^\times.
\]

Some choice is involved in selecting $x_\sigma$, but the element $x_\sigma$ is unique up to a
factor from $K'$ on the right. In fact, if $x_\sigma$ and $y_\sigma$ both behave as in the boxed
formula, then $y_\sigma^{-1} x_\sigma$ commutes with $K'$ and hence is in $K'$. Thus $x_\sigma = y_\sigma c_0'$ with
$c_0'$ in $K'$.

The nonuniqueness can be expressed also in terms of a factor from $K'$ on the
left. In fact, the boxed formula for $c = c_0$ implies that $x_\sigma = (y_\sigma c_0' y_\sigma^{-1}) y_\sigma =
\sigma(c_0)' y_\sigma$.

At any rate, fix a choice of $x_\sigma$ for all $\sigma \in G$, and let us examine the effect of
composition. If $\sigma$ and $\tau$ are in G, then

\[
x_{\sigma\tau} c' x_{\sigma\tau}^{-1} = (\sigma\tau)(c)' = \sigma(\tau(c))' = x_\sigma \tau(c)' x_\sigma^{-1} = x_\sigma x_\tau c' x_\tau^{-1} x_\sigma^{-1}.
\]

Using the result of the previous paragraph, we see that $x_{\sigma\tau}$ and $x_\sigma x_\tau$ are related
by a factor from $K'$ on the left. Hence we can write

\[
x_\sigma x_\tau = a(\sigma, \tau)' x_{\sigma\tau} \qquad \text{with } a(\sigma, \tau) \in K^\times.
\]

If we examine the effect of composing three elements of G, we obtain a
consistency condition that the function $a : G \times G \to K^\times$ must satisfy. Namely,
let $\rho$, $\sigma$, and $\tau$ be in G, and let us compute $x_\rho x_\sigma x_\tau$ in two ways, taking advantage
of the associativity in A. With one grouping, we obtain

\[
x_\rho x_\sigma x_\tau = (x_\rho x_\sigma) x_\tau = a(\rho, \sigma)' x_{\rho\sigma} x_\tau = a(\rho, \sigma)' a(\rho\sigma, \tau)' x_{\rho\sigma\tau},
\]

and with the other grouping, we have

\[
\begin{aligned}
x_\rho x_\sigma x_\tau = x_\rho (x_\sigma x_\tau) &= x_\rho\, a(\sigma, \tau)' x_{\sigma\tau} \\
&= \rho(a(\sigma, \tau))' x_\rho x_{\sigma\tau} = \rho(a(\sigma, \tau))' a(\rho, \sigma\tau)' x_{\rho\sigma\tau}.
\end{aligned}
\]

Therefore the function $a : G \times G \to K^\times$ satisfies

\[
\boxed{\rho(a(\sigma, \tau))\, a(\rho, \sigma\tau) = a(\rho, \sigma)\, a(\rho\sigma, \tau).}
\]

A function $a : G \times G \to K^\times$ satisfying the above boxed formula is called a
factor set. From A, an isomorphism $K \to K'$, and a choice of the elements $x_\sigma$
for $\sigma \in G$, we have obtained a factor set.
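
The factor-set condition is a finite list of identities when G is finite, so it can be checked mechanically. The sketch below is a small illustration, assuming the extension $\mathbb{C}/\mathbb{R}$ with Galois group of order 2; the encoding of G as {0, 1} with 0 the identity and 1 complex conjugation, and the names `act` and `is_factor_set`, are hypothetical choices made for the example. It tests the condition $\rho(a(\sigma,\tau))\,a(\rho,\sigma\tau) = a(\rho,\sigma)\,a(\rho\sigma,\tau)$ for two sample functions a.

```python
# Sketch: brute-force test of the factor-set condition for G = Gal(C/R).
from itertools import product

# Hypothetical encoding: 0 is the identity of G, 1 is complex conjugation,
# and the group law is addition modulo 2.
G = (0, 1)

def act(g, z):
    """Action of g in G on a nonzero complex number z."""
    return z.conjugate() if g else z

def is_factor_set(a):
    """Check rho(a(s,t)) a(rho, s t) == a(rho, s) a(rho s, t) for all rho, s, t."""
    return all(
        act(r, a[s, t]) * a[r, (s + t) % 2] == a[r, s] * a[(r + s) % 2, t]
        for r, s, t in product(G, repeat=3))

# Sample function with a(1, 1) = -1 and value 1 elsewhere.
sample = {(0, 0): 1 + 0j, (0, 1): 1 + 0j, (1, 0): 1 + 0j, (1, 1): -1 + 0j}

# Sample function with a(1, 1) = i, which violates the condition.
counterexample = {(0, 0): 1 + 0j, (0, 1): 1 + 0j, (1, 0): 1 + 0j, (1, 1): 1j}

print(is_factor_set(sample), is_factor_set(counterexample))
```

The second function fails at $\rho = \sigma = \tau$ equal to conjugation, where the left side is $-i$ and the right side is $i$.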


Comparing this boxed formula with the formulas in the second paragraph of
this section, we see that a factor set is exactly a member of $Z^2(\operatorname{Gal}(K/F), K^\times)$
except that the boxed formula uses multiplicative notation for $K^\times$ and the definition
of 2-cocycle uses additive notation. Thus we have associated a member of
$Z^2(\operatorname{Gal}(K/F), K^\times)$ to the triple consisting of A, an isomorphism $K \to K'$, and
a choice of the elements $x_\sigma$ for $\sigma \in G$.

With the extension K/F and the class $X \in B(K/F)$ fixed, let us see the effect
on the factor set of making different choices. The algebra A lies in the Brauer
equivalence class X and has $\dim_F A = (\dim_F K)^2$. As we saw in the remarks
with Theorem 3.3, A is determined up to isomorphism by these properties.

Thus let us start from a different system of choices: an algebra B in the
class X, an isomorphism $K \to K''$, and elements $y_\sigma$ for $\sigma \in G$ such that
$\sigma(c)'' = y_\sigma c'' y_\sigma^{-1}$. Define the corresponding factor set $b : G \times G \to K^\times$ by

\[
y_\sigma y_\tau = b(\sigma, \tau)'' y_{\sigma\tau}.
\]


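We wish to relate $a(\sigma, \tau)$ and $b(\sigma, \tau)$. We have just seen that A and B are
isomorphic as algebras. Let $\varphi : A \to B$ be an isomorphism. Then $c \mapsto c' \mapsto
\varphi(c')$ and $c \mapsto c''$ are two algebra homomorphisms of K into B, and the Skolem-Noether
Theorem (Theorem 2.41) produces an element $t \in B^\times$ with

\[
c'' = t\varphi(c')t^{-1} \qquad \text{for all } c \in K.
\]

Starting from the formula $\sigma(c)' = x_\sigma c' x_\sigma^{-1}$, apply $\varphi$ and conjugate by t to obtain

\[
\sigma(c)'' = t\varphi(\sigma(c)')t^{-1} = \big(t\varphi(x_\sigma)t^{-1}\big)\, c''\, \big(t\varphi(x_\sigma)t^{-1}\big)^{-1}.
\]

This equation says that $t\varphi(x_\sigma)t^{-1}$ serves the same purpose as $y_\sigma$, and therefore

\[
y_\sigma = c_\sigma''\, t\varphi(x_\sigma)t^{-1}
\]

for some member $c_\sigma''$ of $K''$ placed on the left. Substitution into the formula
$y_\sigma y_\tau = b(\sigma, \tau)'' y_{\sigma\tau}$ gives

\[
c_\sigma''\, t\varphi(x_\sigma)t^{-1}\, c_\tau''\, t\varphi(x_\tau)t^{-1} = b(\sigma, \tau)''\, c_{\sigma\tau}''\, t\varphi(x_{\sigma\tau})t^{-1}.
\]

If we substitute from the formula $c'' = t\varphi(c')t^{-1}$ for all members of $K''$ and then
conjugate by $t^{-1}$ and apply $\varphi^{-1}$, we obtain

\[
c_\sigma' x_\sigma c_\tau' x_\tau = b(\sigma, \tau)' c_{\sigma\tau}' x_{\sigma\tau}.
\]

The left side equals

\[
c_\sigma' \sigma(c_\tau)' x_\sigma x_\tau = c_\sigma' \sigma(c_\tau)' a(\sigma, \tau)' x_{\sigma\tau},
\]

and comparison of this expression with the right side gives

\[
b(\sigma, \tau)' c_{\sigma\tau}' = c_\sigma' \sigma(c_\tau)' a(\sigma, \tau)'.
\]

Passing from $K'$ back to K, we conclude that

\[
b(\sigma, \tau)\, c_{\sigma\tau} = c_\sigma\, \sigma(c_\tau)\, a(\sigma, \tau).
\]

This formula says that b is the product of a and the trivial factor set $c : G \times G \to
K^\times$ given by

\[
c(\sigma, \tau) = c_\sigma\, \sigma(c_\tau)\, c_{\sigma\tau}^{-1},
\]

where $\sigma \mapsto c_\sigma$ is some function from G to $K^\times$. Again referring to the second
paragraph of this section and remembering that we are using multiplicative notation
for $K^\times$, we see that the trivial factor sets are the 2-coboundaries, lying
in $B^2(\operatorname{Gal}(K/F), K^\times)$, in the same way that the general factor sets are the
2-cocycles, lying in $Z^2(\operatorname{Gal}(K/F), K^\times)$. We have thus proved the following
proposition.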


Proposition 3.10. Let K be a finite Galois extension of the field F. For
X in B(K/F), let A be an algebra in the Brauer equivalence class X with
$\dim_F A = (\dim_F K)^2$, let $K \to K'$ be an isomorphism of K into A, and let
$\{x_\sigma \mid \sigma \in \operatorname{Gal}(K/F)\} \subseteq A^\times$ be a set of elements such that $\sigma(c)' = x_\sigma c' x_\sigma^{-1}$.
Then the passage from X to the factor set determined by the triple of data
$(A, K \to K', \{x_\sigma\})$ descends to a well-defined function from the abelian group
B(K/F) to the abelian group $H^2(\operatorname{Gal}(K/F), K^\times)$.
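
The role of the trivial factor sets can be checked in a small case. The following sketch is an illustration only, assuming the extension $\mathbb{Q}(i)/\mathbb{Q}$ with its Galois group of order 2; elements of $\mathbb{Q}(i)$ are stored as pairs of rationals, G is encoded as {0, 1} with 0 the identity, and the helper names and the sample values of $c_\sigma$ are hypothetical. It verifies that $c(\sigma,\tau) = c_\sigma\,\sigma(c_\tau)\,c_{\sigma\tau}^{-1}$ satisfies the factor-set condition and that multiplying a factor set by it gives another factor set.

```python
# Sketch: trivial factor sets for G = Gal(Q(i)/Q), with exact arithmetic.
from fractions import Fraction
from itertools import product

# Elements of K = Q(i) are pairs (x, y) standing for x + y i.
def mul(z, w):
    return (z[0]*w[0] - z[1]*w[1], z[0]*w[1] + z[1]*w[0])

def inv(z):
    norm = z[0]*z[0] + z[1]*z[1]
    return (z[0] / norm, -z[1] / norm)

def act(g, z):
    """G = {0, 1} acts on K; 0 is the identity and 1 is complex conjugation."""
    return (z[0], -z[1]) if g else z

def trivial_factor_set(c):
    """c(sigma, tau) = c_sigma * sigma(c_tau) * (c_{sigma tau})^{-1}."""
    return {(s, t): mul(mul(c[s], act(s, c[t])), inv(c[(s + t) % 2]))
            for s, t in product((0, 1), repeat=2)}

def is_factor_set(a):
    """Check rho(a(s,t)) a(rho, s t) == a(rho, s) a(rho s, t) for all rho, s, t."""
    return all(
        mul(act(r, a[s, t]), a[r, (s + t) % 2]) == mul(a[r, s], a[(r + s) % 2, t])
        for r, s, t in product((0, 1), repeat=3))

# Arbitrary sample function sigma -> c_sigma with nonzero values: 2 + i and 1 - 3i.
c = {0: (Fraction(2), Fraction(1)), 1: (Fraction(1), Fraction(-3))}
coboundary = trivial_factor_set(c)

# A sample factor set a, with a(1, 1) = -1 and value 1 elsewhere, and b = a * c.
one, minus_one = (Fraction(1), Fraction(0)), (Fraction(-1), Fraction(0))
a = {(0, 0): one, (0, 1): one, (1, 0): one, (1, 1): minus_one}
b = {key: mul(a[key], coboundary[key]) for key in a}

print(is_factor_set(coboundary), is_factor_set(b))
```

Here b and a differ by a coboundary, so under the function of Proposition 3.10 they would represent the same cohomology class.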


3. Crossed Products 


In this section we continue to assume that K/F is a finite Galois extension of
fields. We are going to show that the function $B(K/F) \to H^2(\operatorname{Gal}(K/F), K^\times)$
given in Proposition 3.10 is an isomorphism of groups. The homomorphism
property comes last and is the hard part of the argument. In the meantime,
we construct the inverse function by associating an algebra to each member of
$Z^2(\operatorname{Gal}(K/F), K^\times)$ and showing in Corollary 3.13 that the resulting function on
$Z^2(\operatorname{Gal}(K/F), K^\times)$ descends to an inverse function from $H^2(\operatorname{Gal}(K/F), K^\times)$
into B(K/F). The algebra is called a “crossed product” and is produced in
Proposition 3.12 below. Before either of these steps, we establish one more
property of the system $\{x_\sigma \mid \sigma \in \operatorname{Gal}(K/F)\}$ of the previous section that has not
needed mentioning until now.



Thus let a central simple algebra A be given with $\dim_F A = (\dim_F K)^2$, along
with an isomorphism $K \to K'$ denoted by $c \mapsto c'$. As in the previous section
we choose $x_\sigma \in A^\times$ with

\[
\sigma(c)' = x_\sigma c' x_\sigma^{-1} \qquad \text{for all } c \in K.
\]

The corresponding factor set $a(\sigma, \tau)$ has

\[
x_\sigma x_\tau = a(\sigma, \tau)' x_{\sigma\tau}.
\]

We regard A as a vector space over $K'$ with $K'$ acting by multiplication on the
left.


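Lemma 3.11. With hypotheses as above, the set $\{x_\sigma \mid \sigma \in \operatorname{Gal}(K/F)\}$ is a
vector-space basis of A over $K'$.

PROOF. Let $G = \operatorname{Gal}(K/F)$. Since $|G| = \dim_F K = \dim_F K' = \dim_{K'} A$, it
is enough to prove linear independence. Arguing by contradiction, assume that
the set $\{x_\sigma \mid \sigma \in G\}$ is linearly dependent. Choose a maximal subset J of G
such that $\{x_\tau \mid \tau \in J\}$ is linearly independent. For $\sigma$ not in J, we then have

\[
x_\sigma = \sum_{\tau \in J} a_\tau' x_\tau \qquad \text{with } a_\tau \in K. \tag{$*$}
\]

Every nonzero c in K satisfies

\[
\sigma(c)' x_\sigma = x_\sigma c' = \sum_{\tau \in J} a_\tau' x_\tau c' = \sum_{\tau \in J} a_\tau' \tau(c)' x_\tau,
\]

and thus $x_\sigma = \sum_{\tau \in J} (\sigma(c)')^{-1} a_\tau' \tau(c)' x_\tau$. Comparing this expansion with $(*)$
shows that

\[
(\sigma(c)')^{-1} a_\tau' \tau(c)' = a_\tau' \qquad \text{for } \tau \in J. \tag{$**$}
\]

Since $x_\sigma \neq 0$, some $a_\tau'$ in the expansion $(*)$ is nonzero. For this $\tau$, we can cancel
$a_\tau'$ in $(**)$ and obtain $\sigma(c)' = \tau(c)'$ for all $c \in K$. Then $\sigma = \tau$, in contradiction
to the fact that $\sigma$ is not in J.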


The linear independence in Lemma 3.11 allows us to read off the structure of
A: as a $K'$ vector space, the algebra A is given by $A = \bigoplus_{\sigma \in \operatorname{Gal}(K/F)} K' x_\sigma$, and
the elements $x_\sigma$ have the properties that

\[
x_\sigma c' = \sigma(c)' x_\sigma \quad \text{for } c \in K \qquad \text{and} \qquad x_\sigma x_\tau = a(\sigma, \tau)' x_{\sigma\tau}.
\]

Proposition 3.12 is motivated by these formulas, saying that we can reconstruct
A from a given 2-cocycle $a(\sigma, \tau)$ in such a way that these formulas hold.


Proposition 3.12. Let K/F be a finite Galois extension, and let $a = a(\sigma, \tau)$
be in $Z^2(\operatorname{Gal}(K/F), K^\times)$. Then there exist a central simple algebra A over F
with $\dim_F A = (\dim_F K)^2$, an isomorphism $K \to K'$ of K onto a subfield $K'$ of
A, and a subset $\{x_\sigma \in A \mid \sigma \in \operatorname{Gal}(K/F)\}$ such that

(a) $A = \bigoplus_{\sigma \in \operatorname{Gal}(K/F)} K' x_\sigma$,

(b) $x_\sigma c' x_\sigma^{-1} = \sigma(c)'$ for all c in K, with $c \mapsto c'$ denoting the isomorphism
of K onto $K'$,

(c) $x_\sigma x_\tau = a(\sigma, \tau)' x_{\sigma\tau}$.


REMARKS. We write A = A(K,Gal(K/F),a) and call A the crossed- 
product algebra corresponding to the factor set a. The algebra A is completely 
determined by the given conditions, up to canonical isomorphism, since (a), (b), 
and (c) determine the entire multiplication table of A. 
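
The multiplication table determined by (a), (b), and (c) can be written out in a small case. The sketch below is an illustration under stated assumptions: it takes $K = \mathbb{C}$ and $F = \mathbb{R}$, encodes the Galois group as {0, 1} with 0 the identity and 1 complex conjugation, uses the sample factor set with $a(\sigma,\sigma) = -1$ for conjugation $\sigma$ and value 1 elsewhere, and stores an element $\sum_\sigma c_\sigma x_\sigma$ as a dictionary. The names `mult`, `i_elt`, and `j_elt` are hypothetical. The product rule used is $(c x_\sigma)(d x_\tau) = c\,\sigma(d)\,a(\sigma,\tau)\,x_{\sigma\tau}$, which is what (b) and (c) force.

```python
# Sketch: the crossed-product multiplication for K = C, F = R, G = Gal(C/R).
from itertools import product

G = (0, 1)  # hypothetical encoding: 0 = identity of G, 1 = complex conjugation

def act(g, z):
    """Action of g in G on the complex number z."""
    return z.conjugate() if g else z

# Sample factor set: a(1, 1) = -1 and a = 1 otherwise.
a = {(0, 0): 1 + 0j, (0, 1): 1 + 0j, (1, 0): 1 + 0j, (1, 1): -1 + 0j}

def mult(u, v):
    """Product of u = sum c_s x_s and v = sum d_t x_t, stored as {g: coeff}."""
    out = {g: 0j for g in G}
    for (s, c), (t, d) in product(u.items(), v.items()):
        out[(s + t) % 2] += c * act(s, d) * a[s, t]
    return out

one = {0: 1 / a[0, 0], 1: 0j}   # the identity, a(e, e)^{-1} x_e for the identity e of G
i_elt = {0: 1j, 1: 0j}          # the image of i under K -> K'
j_elt = {0: 0j, 1: 1 + 0j}      # x_sigma for sigma = conjugation
minus_one = {0: -1 + 0j, 1: 0j}

# An R basis of A, used to test associativity on all triples.
basis = [one, i_elt, j_elt, mult(i_elt, j_elt)]
associative = all(mult(mult(u, v), w) == mult(u, mult(v, w))
                  for u, v, w in product(basis, repeat=3))

# The relations i^2 = j^2 = -1 and ij = -ji.
ji_negated = {g: -z for g, z in mult(j_elt, i_elt).items()}
quaternion_relations = (mult(i_elt, i_elt) == minus_one
                        and mult(j_elt, j_elt) == minus_one
                        and mult(i_elt, j_elt) == ji_negated)

print(associative, quaternion_relations)
```

With this choice of factor set the resulting 4-dimensional real algebra satisfies the defining relations of the quaternions.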


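PROOF. Let $G = \operatorname{Gal}(K/F)$, form a set $\{x_\sigma \mid \sigma \in G\}$, and let A be the K
vector space (free K module) with basis $\{x_\sigma\}$. Then $A = \bigoplus_{\sigma \in G} K x_\sigma$. Define a
multiplication on K multiples of basis vectors in A by

\[
(c x_\sigma)(d x_\tau) = c\,\sigma(d)\,a(\sigma, \tau)\,x_{\sigma\tau}, \tag{$*$}
\]

and extend it to a multiplication on A by additivity.

First we shall check that A is an associative F algebra with $a(1, 1)^{-1}x_1$ as
identity by making use of the cocycle property

\[
\rho(a(\sigma, \tau))\,a(\rho, \sigma\tau) = a(\rho, \sigma)\,a(\rho\sigma, \tau). \tag{$**$}
\]

For associativity, $(*)$ gives

\[
\begin{aligned}
(b x_\rho)\big((c x_\sigma)(d x_\tau)\big) &= (b x_\rho)\big(c\,\sigma(d)\,a(\sigma, \tau)\,x_{\sigma\tau}\big) \\
&= b\,\rho(c)\,(\rho\sigma(d))\,\rho(a(\sigma, \tau))\,a(\rho, \sigma\tau)\,x_{\rho\sigma\tau}
\end{aligned}
\]

and

\[
\begin{aligned}
\big((b x_\rho)(c x_\sigma)\big)(d x_\tau) &= \big(b\,\rho(c)\,a(\rho, \sigma)\,x_{\rho\sigma}\big)(d x_\tau) \\
&= b\,\rho(c)\,a(\rho, \sigma)\,(\rho\sigma(d))\,a(\rho\sigma, \tau)\,x_{\rho\sigma\tau},
\end{aligned}
\]

and the right sides are equal by $(**)$. To see that $a(1, 1)^{-1}x_1$ is a two-sided
identity, take $\rho = \sigma = 1$ in $(**)$ to get $1(a(1, \tau))\,a(1, \tau) = a(1, 1)\,a(1, \tau)$. Since
a takes values in $K^\times$, we can cancel and obtain

\[
a(1, \tau) = a(1, 1). \tag{$\dagger$}
\]

Thus $(*)$ gives

\[
(a(1, 1)^{-1} x_1)(d x_\tau) = a(1, 1)^{-1}\,1(d)\,a(1, \tau)\,x_\tau = d x_\tau.
\]

Similarly another specialization of $(**)$ is $\sigma(a(1, 1))\,a(\sigma, 1) = a(\sigma, 1)\,a(\sigma, 1)$,
from which we obtain

\[
\sigma(a(1, 1)) = a(\sigma, 1). \tag{$\dagger\dagger$}
\]

Thus $(*)$ gives

\[
(c x_\sigma)(a(1, 1)^{-1} x_1) = c\,\sigma(a(1, 1))^{-1}\,a(\sigma, 1)\,x_\sigma = c x_\sigma,
\]

and $a(1, 1)^{-1}x_1$ is indeed a two-sided identity. We denote it by 1. Scalar multiplication
by $r \in F$ is understood to be the additive extension of $r(c x_\sigma) = (rc) x_\sigma$
for $c \in K$, and the identities

\[
\begin{aligned}
(r(c x_\sigma))(d x_\tau) &= rc\,\sigma(d)\,a(\sigma, \tau)\,x_{\sigma\tau}, \\
(c x_\sigma)(r(d x_\tau)) &= c\,\sigma(rd)\,a(\sigma, \tau)\,x_{\sigma\tau} = rc\,\sigma(d)\,a(\sigma, \tau)\,x_{\sigma\tau}, \\
r((c x_\sigma)(d x_\tau)) &= rc\,\sigma(d)\,a(\sigma, \tau)\,x_{\sigma\tau}
\end{aligned}
\]

show that multiplication in A is F linear with respect to scalars, hence show that
A is an algebra over F.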

Second we define $K' \subseteq A$ and an isomorphism $K \to K'$. For $b \in K$, we let
$b'$ be the member of A given by $b' = b1 = b(a(1, 1)^{-1}x_1)$, and we let $K'$ be the
image of K under $b \mapsto b'$. The map $b \mapsto b'$ certainly respects addition, and it
respects multiplication because the identity

\[
(b_1 a(1, 1)^{-1} x_1)(b_2 a(1, 1)^{-1} x_1) = b_1 b_2 a(1, 1)^{-1} x_1
\]

is immediate from $(*)$. Hence $K'$ is a subfield of A.
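Third we prove properties (a), (b), and (c). For (a), we use $(*)$ and $(\dagger)$ to obtain
the identity

\[
b' x_\sigma = (b\,a(1, 1)^{-1} x_1) x_\sigma = b\,a(1, 1)^{-1} a(1, \sigma)\,x_\sigma = b x_\sigma. \tag{$\ddagger$}
\]

This identity shows that $K' x_\sigma = K x_\sigma$, and (a) follows. From $(*)$, we see
also that $x_\sigma (b x_{\sigma^{-1}}) = (1 x_\sigma)(b x_{\sigma^{-1}}) = 1\,\sigma(b)\,a(\sigma, \sigma^{-1})\,x_1$ and that $(b x_{\sigma^{-1}}) x_\sigma =
b\,\sigma^{-1}(1)\,a(\sigma^{-1}, \sigma)\,x_1$; thus $x_\sigma$ has a right inverse in A and also a left inverse, hence a
two-sided inverse. Consequently the statement of (b) is meaningful; for its proof
we have only to observe that

\[
\begin{aligned}
x_\sigma c' x_\sigma^{-1} &= \big(x_\sigma (c\,a(1, 1)^{-1} x_1)\big) x_\sigma^{-1} = \big(\sigma(c)\,\sigma(a(1, 1))^{-1} a(\sigma, 1)\,x_\sigma\big) x_\sigma^{-1} \\
&= (\sigma(c)\,x_\sigma) x_\sigma^{-1} = \sigma(c)' x_\sigma x_\sigma^{-1} = \sigma(c)',
\end{aligned}
\]

the last three equalities following from $(\dagger\dagger)$, $(\ddagger)$, and the identity $x_\sigma x_\sigma^{-1} = 1$.
For (c), we have

\[
x_\sigma x_\tau = a(\sigma, \tau)\,x_{\sigma\tau} = a(\sigma, \tau)' x_{\sigma\tau},
\]

the second equality following from $(\ddagger)$.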

Fourth we show that A is simple. Let I be a proper two-sided ideal in A, and
let $\varphi : A \to A/I$ be the quotient homomorphism. Since 1 is not in I and since
$K'$ is a subfield of A, we know that $\ker(\varphi|_{K'}) = 0$ and that $\varphi(K')$ is a subfield of
$A/I$. The field $\varphi(K')$ acts on $A/I$ by left multiplication and makes $A/I$ into a
$\varphi(K')$ vector space. The members $\varphi(x_\sigma)$ of $A/I$ certainly span $A/I$ over $\varphi(K')$
because of (a), and the claim is that they are linearly independent. If so, then $\varphi$
is one-one, I equals 0, and A is simple. For the linear independence, we argue
by contradiction in the same way as for Lemma 3.11. Suppose that $J \subseteq G$ is a
maximal subset such that $\{\varphi(x_\tau) \mid \tau \in J\}$ is linearly independent over $\varphi(K')$.
For $\sigma$ not in J, we then have

$$\varphi(x_\sigma) = \sum_{\tau\in J}\varphi(a_\tau')\,\varphi(x_\tau) \quad\text{with } a_\tau \in K. \qquad (\ddagger\ddagger)$$

Every c in K satisfies

$$\varphi(\sigma(c)')\,\varphi(x_\sigma) = \varphi(x_\sigma)\,\varphi(c') = \sum_{\tau\in J}\varphi(a_\tau')\,\varphi(x_\tau)\,\varphi(c') = \sum_{\tau\in J}\varphi(a_\tau')\,\varphi(\tau(c)')\,\varphi(x_\tau),$$

and thus

$$\varphi(x_\sigma) = \sum_{\tau\in J}\varphi(\sigma(c)')^{-1}\varphi(a_\tau')\,\varphi(\tau(c)')\,\varphi(x_\tau).$$

Comparing this expansion with (‡‡) shows that

$$\varphi(\sigma(c)')^{-1}\varphi(a_\tau')\,\varphi(\tau(c)') = \varphi(a_\tau') \quad\text{for } \tau \in J. \qquad (\S)$$

Since $x_\sigma$ is invertible in A, $\varphi(x_\sigma)$ is invertible in $A/I$ and cannot be 0. Therefore
some $\varphi(a_\tau')$ in the expansion (‡‡) is nonzero. For this $\tau$, we can cancel $\varphi(a_\tau')$ in
(§) and obtain $\varphi(\sigma(c)') = \varphi(\tau(c)')$ for all $c \in K$. Since $\varphi$ is one-one on $K'$, we
conclude that $\sigma = \tau$, in contradiction to the fact that $\sigma$ is not in J. Therefore A
is simple.

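Fifth we show that A has center F. Thus suppose that $\sum_\sigma c_\sigma'x_\sigma$ is central.
Commutativity with $d'x_\tau$ forces the two expressions

$$\Bigl(\sum_\sigma c_\sigma'x_\sigma\Bigr)(d'x_\tau) = \sum_\sigma (c_\sigma x_\sigma)(dx_\tau) = \sum_\sigma c_\sigma\,\sigma(d)\,a(\sigma,\tau)\,x_{\sigma\tau}$$

and

$$(d'x_\tau)\Bigl(\sum_\sigma c_\sigma'x_\sigma\Bigr) = \sum_\sigma (dx_\tau)(c_\sigma x_\sigma) = \sum_\sigma d\,\tau(c_\sigma)\,a(\tau,\sigma)\,x_{\tau\sigma}
= \sum_\sigma d\,\tau(c_{\tau^{-1}\sigma\tau})\,a(\tau,\tau^{-1}\sigma\tau)\,x_{\sigma\tau}$$

to be equal. Hence

$$d\,\tau(c_{\tau^{-1}\sigma\tau})\,a(\tau,\tau^{-1}\sigma\tau) = c_\sigma\,\sigma(d)\,a(\sigma,\tau) \quad\text{for all } d, \sigma, \tau. \qquad (\S\S)$$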




Putting $d = 1$ in (§§) shows that $\tau(c_{\tau^{-1}\sigma\tau})\,a(\tau,\tau^{-1}\sigma\tau) = c_\sigma a(\sigma,\tau)$. Substituting
from this equation into the left side of (§§) gives

$$d\,c_\sigma a(\sigma,\tau) = c_\sigma\,\sigma(d)\,a(\sigma,\tau) \quad\text{for all } d, \sigma, \tau.$$

If $c_\sigma \neq 0$, we see that $\sigma(d) = d$ for all $d \in K$; thus $c_\sigma \neq 0$ only for $\sigma = 1$. For
$\sigma = 1$ and $d = 1$, (§§) reduces to

$$\tau(c_1)\,a(\tau,1) = c_1a(1,\tau).$$

Taking into account (†) and (††), we obtain

$$\tau(c_1a(1,1)) = c_1a(1,1).$$

Since $\tau$ is arbitrary, this says that $c_1a(1,1)$ is in F. Thus the central element is
$c_1'x_1 = c_1x_1 = c_1a(1,1)\,a(1,1)^{-1}x_1 = (c_1a(1,1))1$ and is an F multiple of the
identity.

Since $\{x_\sigma\}$ by definition is a basis of A over K, we have $\dim_K A = |G| =
\dim_F K$. Multiplying this equation by $\dim_F K$ yields $\dim_F A = (\dim_F K)^2$.
This completes the proof.
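
As an illustration of the construction, the following sketch builds the crossed product for the extension C/R, whose Galois group has two elements. The cocycle used here (value -1 at the pair (sigma, sigma) and 1 elsewhere) is an assumed example and is not taken from the text; with it the multiplication rule (*) should reproduce Hamilton's quaternions. The helper names `act`, `mul`, `times`, and `cocycle_ok` are hypothetical.

```python
from itertools import product

# Galois group of C/R: index 0 is the identity, index 1 is complex conjugation.
G = (0, 1)

def act(s, z):
    """Apply the group element s to the complex number z."""
    return z.conjugate() if s == 1 else z

def mul(s, t):
    """Group law of the cyclic group of order 2."""
    return (s + t) % 2

# Assumed example cocycle: a(sigma, sigma) = -1 and a = 1 elsewhere.
a = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): -1}

def cocycle_ok():
    """Check rho(a(s,t)) a(rho, st) == a(rho, s) a(rho s, t) for all triples."""
    return all(
        act(r, complex(a[s, t])) * a[r, mul(s, t)] == a[r, s] * a[mul(r, s), t]
        for r, s, t in product(G, repeat=3)
    )

def times(p, q):
    """Multiply p = {s: c_s} and q = {t: d_t} using
    (c x_s)(d x_t) = c s(d) a(s,t) x_{st}, extended additively."""
    out = {0: 0j, 1: 0j}
    for s, c in p.items():
        for t, d in q.items():
            out[mul(s, t)] += c * act(s, d) * a[s, t]
    return out

one = {0: 1 + 0j, 1: 0j}   # a(1,1)^{-1} x_1, the identity of A
i = {0: 1j, 1: 0j}         # the scalar i in K' = C
j = {0: 0j, 1: 1 + 0j}     # the basis element x_sigma
```

In this model `times(j, j)` is minus the identity and `times(i, j)` is the negative of `times(j, i)`, which matches the relations of the quaternion algebra; the example is only a finite check, not a proof.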


Corollary 3.13. If K is a finite Galois extension of the field F, then the map
$B(K/F) \to H^2(\mathrm{Gal}(K/F), K^\times)$ defined via factor sets is one-one onto.


PROOF. Put $G = \mathrm{Gal}(K/F)$. If $a : G \times G \to K^\times$ is in $Z^2(G, K^\times)$, then
we can construct an algebra A via Proposition 3.12, and the claim is that the
map $a \mapsto A$ descends to $H^2(G, K^\times)$ and is a two-sided inverse to the map from
$B(K/F)$ into $H^2(G, K^\times)$ given in Proposition 3.10.

First we show that $a \mapsto A$ descends to $H^2(G, K^\times)$. Thus suppose that b is a
second cocycle and is of the form $b(\sigma,\tau) = a(\sigma,\tau)\,c_\sigma\,\sigma(c_\tau)\,c_{\sigma\tau}^{-1}$, i.e., represents
the same member of $H^2(G, K^\times)$. Let B be the algebra constructed from b by
Proposition 3.12, say with K mapping to $K'' \subseteq B$ via $c \mapsto c''$ and with

(a') $B = \bigoplus_{\sigma\in G} K''y_\sigma$ for a subset $\{y_\sigma\}$ of B,

(b') $y_\sigma c''y_\sigma^{-1} = \sigma(c)''$,

(c') $y_\sigma y_\tau = b(\sigma,\tau)''y_{\sigma\tau}$.

Define $\varphi : A \to B$ to be the additive extension of the function with
$\varphi(c'x_\sigma) = c''(c_\sigma^{-1})''y_\sigma$. To check that $\varphi$ is an algebra homomorphism, we start
from the formula $(c'x_\sigma)(d'x_\tau) = c'\sigma(d)'a(\sigma,\tau)'x_{\sigma\tau}$ and apply $\varphi$ to obtain

$$\varphi((c'x_\sigma)(d'x_\tau)) = c''\sigma(d)''a(\sigma,\tau)''(c_{\sigma\tau}^{-1})''y_{\sigma\tau}.$$




Meanwhile,

$$\varphi(c'x_\sigma)\,\varphi(d'x_\tau) = (c''(c_\sigma^{-1})''y_\sigma)(d''(c_\tau^{-1})''y_\tau)$$
$$= c''(c_\sigma^{-1})''\sigma(d)''\sigma(c_\tau^{-1})''\,b(\sigma,\tau)''y_{\sigma\tau}$$
$$= c''(c_\sigma^{-1})''\sigma(d)''\sigma(c_\tau^{-1})''\,a(\sigma,\tau)''c_\sigma''\sigma(c_\tau)''(c_{\sigma\tau}^{-1})''y_{\sigma\tau}.$$

Hence $\varphi((c'x_\sigma)(d'x_\tau)) = \varphi(c'x_\sigma)\,\varphi(d'x_\tau)$, and $\varphi$ is an algebra homomorphism.
Since $\varphi$ carries K basis to K basis, $\varphi$ is an algebra isomorphism.

Thus the map $a \mapsto A$ descends to a map from $H^2(G, K^\times)$ into $B(K/F)$.
Starting from a cocycle a in $Z^2(G, K^\times)$, we can construct A and elements $x_\sigma$ by
Proposition 3.12, we can apply Propositions 3.12b and 3.10 to the $x_\sigma$'s to obtain
another cocycle $\bar a$ in $Z^2(G, K^\times)$, and we can use Proposition 3.12c to see that
$\bar a = a$. In the reverse direction if we start from an algebra A, make a set of
choices, and form a factor set a by means of Proposition 3.10, then Proposition
3.12 constructs an algebra $\bar A$ that has to be isomorphic to A because conditions
(a) through (c) in Proposition 3.12 determine the same multiplication table for an
algebra as was used in constructing the cocycle a.


Theorem 3.14. If K is a finite Galois extension of the field F, then the map
$B(K/F) \to H^2(\mathrm{Gal}(K/F), K^\times)$ defined via factor sets is a group isomorphism.


REMARKS. Put $G = \mathrm{Gal}(K/F)$. In view of Corollary 3.13, it is enough to
prove that the mapping $Z^2(G, K^\times) \to B(K/F)$ of Proposition 3.12 is a group
homomorphism. Thus let A, B, and C be the crossed-product algebras $A =
A(K,G,a)$, $B = A(K,G,b)$, and $C = A(K,G,ab)$. We are to prove that
$A \otimes_F B$ is Brauer equivalent to C. Each of A, B, and C has F dimension
$(\dim_F K)^2$, and hence $A \otimes_F B$ will not be isomorphic to C. Consequently we
need to prove Brauer equivalence of two specific nonisomorphic algebras. This
is the circumstance that makes the proof complicated.


PROOF (Chase). Let G, a, b, A, B, and C be as in the remarks. We can regard
A and B as vector spaces over K with K acting on the left in each case. We define
an F vector space M to be the quotient of $A \otimes_F B$ by the F vector subspace I
generated by all vectors $ca \otimes b - a \otimes cb$ with $a \in A$, $b \in B$, and $c \in K$. We
write $M = A \otimes_K B$ for this quotient, even though more standard notation for it
might be $A^o \otimes_K B$ with $A^o$ as a right K module and B as a left K module.

The subspace I is carried to itself by right multiplication by any member of
the algebra $A \otimes_F B$ and hence is a right ideal. The quotient M is therefore a
unital right $A \otimes_F B$ module with $(a \otimes_K b)(a' \otimes_F b') = aa' \otimes_K bb'$ for $a \otimes_K b$
in M and $a' \otimes_F b'$ in $A \otimes_F B$.

We shall make the unital right $A \otimes_F B$ module M into a unital $(C, A \otimes_F B)$
bimodule by introducing an action by C on the left. For this purpose let $\{u_\sigma\}$, $\{v_\sigma\}$,
and $\{w_\sigma\}$ be the distinguished K bases of the algebras A, B, and C indexed by G
and used to form A, B, and C from the 2-cocycles a, b, and ab. Given an element
$xw_\sigma$ in C with $x \in K$, define $xw_\sigma$ on $A \otimes_F B$ to be (left by $xu_\sigma$) $\otimes$ (left by $v_\sigma$).
Let us see that this operation carries the generators of I into I. We have

$$(xw_\sigma)(ca \otimes_F b) - (xw_\sigma)(a \otimes_F cb) = xu_\sigma ca \otimes_F v_\sigma b - xu_\sigma a \otimes_F v_\sigma cb$$
$$= x\sigma(c)u_\sigma a \otimes_F v_\sigma b - xu_\sigma a \otimes_F \sigma(c)v_\sigma b$$
$$= \sigma(c)(xu_\sigma a) \otimes_F (v_\sigma b) - (xu_\sigma a) \otimes_F \sigma(c)(v_\sigma b),$$

and the right side is indeed in I. Thus we obtain an operation of $xw_\sigma$ on the left
for $A \otimes_K B$ such that

$$(xw_\sigma)(a \otimes_K b) = xu_\sigma a \otimes_K v_\sigma b \quad\text{for } x \in K,\ \sigma \in G,\ a \in A,\ b \in B. \qquad (*)$$

We extend this definition by additivity in such a way that all of C operates on the
left for $A \otimes_K B$.

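The claim is that the additive extension (*) to C makes $M = A \otimes_K B$ into a
unital left C module. What needs proof is that 1 acts as 1 and that

$$((xw_\sigma)(yw_\tau))(a \otimes_K b) = (xw_\sigma)((yw_\tau)(a \otimes_K b)). \qquad (**)$$

The element 1 in C is $a(1,1)^{-1}b(1,1)^{-1}w_1$, and we have

$$(a(1,1)^{-1}b(1,1)^{-1}w_1)(a \otimes_K b) = a(1,1)^{-1}b(1,1)^{-1}u_1a \otimes_K v_1b$$
$$= a(1,1)^{-1}u_1a \otimes_K b(1,1)^{-1}v_1b = a \otimes_K b.$$

Thus 1 acts as 1. For (**), the left side is

$$(x\sigma(y)\,a(\sigma,\tau)\,b(\sigma,\tau)\,w_{\sigma\tau})(a \otimes_K b) = x\sigma(y)\,a(\sigma,\tau)\,b(\sigma,\tau)\,u_{\sigma\tau}a \otimes_K v_{\sigma\tau}b,$$

while the right side is

$$(xw_\sigma)(yu_\tau a \otimes_K v_\tau b) = xu_\sigma yu_\tau a \otimes_K v_\sigma v_\tau b = x\sigma(y)\,u_\sigma u_\tau a \otimes_K v_\sigma v_\tau b$$
$$= x\sigma(y)\,a(\sigma,\tau)\,u_{\sigma\tau}a \otimes_K b(\sigma,\tau)\,v_{\sigma\tau}b.$$

These are equal, since $b(\sigma,\tau)$ is in K and therefore moves across the tensor-product
sign.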

Thus M is a unital left C module. The left action by C certainly commutes
with the right action by $A \otimes_F B$, and M is consequently a unital $(C, A \otimes_F B)$
bimodule. Each member of $A \otimes_F B$ therefore yields by its right action a member
of the ring $\mathrm{End}_C(M)$, and we obtain a ring homomorphism of $(A \otimes_F B)^o$ into
$\mathrm{End}_C(M)$. Since $A \otimes_F B$ is a simple ring, this homomorphism is one-one. If we
can prove that this homomorphism is onto, then we will have a ring isomorphism
$(A \otimes_F B)^o \cong \mathrm{End}_C(M)$, and the rest will be easy.

To see that the homomorphism is onto, we shall calculate dimensions. Let
$n = \dim_F K$. Then each of A, B, and C has F dimension $n^2$, and we have

$$\dim_F M = (\dim_F A)(\dim_F B)/(\dim_F K) = n^2n^2/n = n^3 = (\dim_F C)\,n.$$

Since the algebra C is simple, every unital left C module is semisimple and is in
fact isomorphic to a multiple of a simple left C module V. The above dimensional
equality says that if r is the integer such that C is isomorphic to $rV$ as a left C
module, then M is isomorphic to $nrV$.

Let $D^o$ be the division algebra $\mathrm{End}_C(V)$. As in the proof of Wedderburn's
Theorem (Theorem 2.2), we know for each integer m that

$$\mathrm{End}_C(mV) \cong M_m(\mathrm{End}_C(V)) = M_m(D^o). \qquad (\dagger)$$

Taking $m = r$ in (†) gives $C^o \cong \mathrm{End}_C(rV) \cong M_r(D^o)$. Hence

$$C \cong M_r(D), \qquad (\dagger\dagger)$$

and $\dim_F C = r^2\dim_F D$. Since $\dim_F C = (\dim_F K)^2 = n^2$, we obtain
$\dim_F D = n^2/r^2$. Taking $m = nr$ in (†) gives

$$\mathrm{End}_C(M) \cong \mathrm{End}_C(nrV) \cong M_{nr}(D^o), \qquad (\ddagger)$$

and we therefore obtain

$$\dim_F \mathrm{End}_C(M) = n^2r^2\dim_F D = (n^2r^2)(n^2/r^2) = n^4.$$

Since $\dim_F(A \otimes_F B) = n^4$, we obtain $\dim_F(A \otimes_F B)^o = \dim_F \mathrm{End}_C(M)$, and
we conclude that the algebra homomorphism $(A \otimes_F B)^o \to \mathrm{End}_C(M)$ is onto.
Thus it is an isomorphism, and $A \otimes_F B \cong (\mathrm{End}_C(M))^o$.

Combining this isomorphism with (‡) shows that $A \otimes_F B \cong M_{nr}(D)$. In view
of (††), $A \otimes_F B$ is therefore Brauer equivalent to C.


Corollary 3.15. If D is a finite-dimensional central division algebra of dimension
$m^2$ over a field F, then the m-fold tensor product of D with itself over F is
a full matrix algebra over F.


PROOF. Corollary 3.9 produces a finite Galois extension K of F such that
K splits D. Write G for $\mathrm{Gal}(K/F)$. In view of Theorems 3.3 and 2.4, there
exists an integer l such that $A = M_l(D)$ contains a subfield $K'$ isomorphic to K
with $\dim_F A = (\dim_F K')^2$. Changing notation, we may redefine $K = K'$. Let
$n = \dim_F K$. Then $n^2 = \dim_F A = l^2\dim_F D = (lm)^2$, and $n = lm$. Following
the construction of factor sets in Section 2 and using Lemma 3.11, we form a
vector-space basis $\{x_\sigma \mid \sigma \in G\}$ of A over K and a factor set $\{a(\sigma,\tau)\}$ such that
$x_\sigma x_\tau = a(\sigma,\tau)\,x_{\sigma\tau}$ and $\sigma(c) = x_\sigma cx_\sigma^{-1}$ for all c in K.

Example 1 of semisimple rings in Section II.2 shows that the left A module A
is the direct sum of l isomorphic simple left A modules. Let V be one of these.
Restricting the module structure of V from A to K makes V into a unital left K
module, hence into a vector space over K. Then we have

$$n^2 = \dim_F A = l\dim_F V = l(\dim_K V)(\dim_F K) = ln\dim_K V,$$

and $\dim_K V = m$. Let $v_1, \dots, v_m$ be a K basis of V. For each $x \in A$, define a
matrix $C(x)$ in $M_m(K)$ by

$$xv_j = \sum_{i=1}^m C(x)_{ij}\,v_i.$$

For $\sigma$ and $\tau$ in G, we compute $x_\sigma x_\tau v_j$ in two ways as

$$x_\sigma x_\tau v_j = a(\sigma,\tau)\,x_{\sigma\tau}v_j = a(\sigma,\tau)\sum_{i=1}^m C(x_{\sigma\tau})_{ij}\,v_i \qquad (*)$$

and as

$$x_\sigma x_\tau v_j = x_\sigma\sum_{k=1}^m C(x_\tau)_{kj}\,v_k = \sum_{k=1}^m \sigma(C(x_\tau)_{kj})\,x_\sigma v_k = \sum_{i,k=1}^m \sigma(C(x_\tau)_{kj})\,C(x_\sigma)_{ik}\,v_i.$$

If we write $\sigma(C(x_\tau))$ for the result of applying $\sigma$ to each entry of $C(x_\tau)$, then we
obtain

$$x_\sigma x_\tau v_j = \sum_{i=1}^m \bigl(C(x_\sigma)\,\sigma(C(x_\tau))\bigr)_{ij}\,v_i. \qquad (**)$$

Comparing (*) and (**) leads to the matrix equation in $M_m(K)$ given by

$$a(\sigma,\tau)\,C(x_{\sigma\tau}) = C(x_\sigma)\,\sigma(C(x_\tau)).$$

Putting $c_\sigma = \det C(x_\sigma)$ and taking the determinant of both sides yields

$$a(\sigma,\tau)^m c_{\sigma\tau} = c_\sigma\,\sigma(c_\tau).$$

This equation shows that $a(\sigma,\tau)^m$ is a trivial factor set. Applying Theorem 3.14,
we see that the m-th power of the Brauer equivalence class of A is trivial. Since A
is Brauer equivalent to D, the corollary follows.
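
The determinant step can be tried on a small case. The sketch below assumes the quaternion algebra over R as the example, with K = C, l = 1, m = 2, x_sigma = j, and V the algebra itself with K basis {1, j}; none of these choices comes from the text. Under those assumptions it checks the matrix equation a(sigma,tau) C(x_{sigma tau}) = C(x_sigma) sigma(C(x_tau)) and its determinant consequence. The helper names are hypothetical.

```python
G = (0, 1)                         # 0 = identity, 1 = complex conjugation

# Assumed factor set of the quaternions: x_sigma = j, so a(sigma, sigma) = j*j = -1.
a = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): -1}

# C[s] is the matrix of left multiplication by x_s on V = C + C j in the basis {1, j}:
# j*1 = j and j*j = -1, giving the columns (0, 1) and (-1, 0).
C = {0: [[1, 0], [0, 1]],
     1: [[0, -1], [1, 0]]}

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def twist(s, A):
    """Apply the group element s to every entry of A."""
    return [[z.conjugate() if s == 1 else z for z in row] for row in A]

def scale(c, A):
    return [[c * z for z in row] for row in A]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def matrix_identity_holds():
    return all(scale(a[s, t], C[(s + t) % 2]) == matmul(C[s], twist(s, C[t]))
               for s in G for t in G)

def determinant_identity_holds(m=2):
    """a(s,t)^m c_{st} == c_s s(c_t) with c_s = det C(x_s)."""
    return all(a[s, t] ** m * det(C[(s + t) % 2])
               == det(C[s]) * (det(C[t]).conjugate() if s == 1 else det(C[t]))
               for s in G for t in G)
```

Here both determinants equal 1, so the square of the factor set is visibly trivial, consistent with the quaternions having order 2 in the Brauer group of R.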




Corollary 3.16. If F is any field, then every element of $B(F)$ has finite order.


PROOF. If A is any central simple algebra over F, then Theorem 2.4 shows
that $A \cong M_l(D)$ for some integer $l \geq 1$ and some central division algebra D over
F. Corollary 3.15 shows that the Brauer equivalence class of D has finite order
in $B(F)$. Since A is Brauer equivalent to D, the same thing is true for A.


4. Hilbert’s Theorem 90 


Let K/F be a finite Galois extension of fields. Our interest in this section will be
in the cohomology groups $H^q(\mathrm{Gal}(K/F), K^\times)$ with q possibly different from 2.
For $q = 0$, $H^0(G, N)$ is always the subgroup of elements of N fixed by every
element of G. In the case of a Galois extension, the members of $K^\times$ fixed by the
Galois group are the nonzero elements of the base field F. Thus we have

$$H^0(\mathrm{Gal}(K/F), K^\times) = F^\times.$$

In addition, Theorem 3.14 has established an isomorphism

$$H^2(\mathrm{Gal}(K/F), K^\times) \cong B(K/F),$$

and thus we have already obtained some understanding of this group for $q = 2$.

We shall examine $H^1$ in a moment, but first we take note of another fact about
$H^2$. Problem 16b at the end of Chapter VII of Basic Algebra shows that if G
is a finite group and N is an abelian group on which G acts by automorphisms,
then every element of $H^q(G, N)$ for $q > 0$ has order dividing $|G|$. In particular,
every element of $H^2(\mathrm{Gal}(K/F), K^\times)$ has order dividing $\dim_F K$ whenever K is
a finite Galois extension of F. Applying Theorem 3.14, we see that every member
of $B(K/F)$ has order dividing $\dim_F K$. In view of Corollary 3.9, this argument
gives a new and shorter proof of the result of Corollary 3.16 that every member
of $B(F)$ has finite order. The estimate of the order via Corollary 3.15, however,
is sharper than the estimate obtained via the shorter proof, and this distinction
makes all the difference in Problem 12 at the end of the chapter.

The result concerning $H^1$ and its important special case given as Corollary
3.18 below are known as Hilbert's Theorem 90.


Theorem 3.17. If K/F is any finite Galois extension of fields, then
$H^1(\mathrm{Gal}(K/F), K^\times) = 0$.


PROOF. Let $G = \mathrm{Gal}(K/F)$, put $n = \dim_F K$, and enumerate G as $\sigma_1, \dots, \sigma_n$.
By the Theorem of the Primitive Element, we can write $K = F(\alpha)$ for some $\alpha$ in
K, and then $\{1, \alpha, \alpha^2, \dots, \alpha^{n-1}\}$ is a basis of K over F. Form the n-by-n matrix M
with entries in K whose $(i,j)$-th entry is $\sigma_j(\alpha^{i-1})$. This is a Vandermonde matrix,
and Corollary 5.3 of Basic Algebra gives its determinant as $\prod_{i<j}[\sigma_j(\alpha) - \sigma_i(\alpha)]$.
This determinant cannot be 0, since $\sigma_j(\alpha) = \sigma_i(\alpha)$ implies $\sigma_j(\alpha^k) = \sigma_j(\alpha)^k =
\sigma_i(\alpha)^k = \sigma_i(\alpha^k)$ for all k and then $\sigma_j(x) = \sigma_i(x)$ for all x. Hence the matrix M
is nonsingular.

Let f be a nonzero element in $Z^1(G, K^\times)$. Such a function $f : G \to K^\times$ is
nowhere vanishing and has $f(\sigma\tau) = f(\sigma)\,\sigma(f(\tau))$ for all $\sigma$ and $\tau$ in G. Since
the matrix M is nonsingular, the nontrivial linear combination $\sum_{\sigma\in G} f(\sigma)\,\sigma$
cannot be 0 on all members of the basis $\{1, \alpha, \alpha^2, \dots, \alpha^{n-1}\}$. Choose k with
$\sum_{\sigma\in G} f(\sigma)\,\sigma(\alpha^k) = \gamma \neq 0$. Applying $\tau \in G$ to this equation, we obtain

$$\tau(\gamma) = \sum_{\sigma\in G}\tau(f(\sigma))\,\tau\sigma(\alpha^k) = \sum_{\sigma\in G} f(\tau)^{-1}f(\tau\sigma)\,\tau\sigma(\alpha^k)
= f(\tau)^{-1}\sum_{\sigma\in G} f(\sigma)\,\sigma(\alpha^k) = f(\tau)^{-1}\gamma.$$

The equation $f(\tau)^{-1} = \tau(\gamma)\gamma^{-1}$ shows that $f^{-1}$ is a coboundary, hence that f
is a coboundary.


Corollary 3.18. If K/F is a finite Galois extension with cyclic Galois group
and if $\sigma$ is a generator of the Galois group, then every member x of K with
$N_{K/F}(x) = 1$ is of the form $x = \sigma(y)y^{-1}$ for some $y \in K^\times$.


REMARKS. The instance of this corollary in which K is a quadratic number 
field and F is the field Q appears as Problem 27 at the end of Chapter I. In 
subsequent problems at the end of that chapter, Problem 27 plays a crucial role 
in showing that various possible definitions of genera are equivalent. 


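PROOF. Let $G = \{1, \sigma, \sigma^2, \dots, \sigma^{n-1}\}$ be the Galois group, and define a
function $F : \mathbb{Z} \to K^\times$ by $F(0) = 1$ and

$$F(k) = x\,\sigma(x)\,\sigma^2(x)\cdots\sigma^{k-1}(x) \quad\text{for } k \geq 1.$$

Then we have

$$F(k+l) = x\,\sigma(x)\,\sigma^2(x)\cdots\sigma^{k+l-1}(x)$$
$$= \bigl(x\,\sigma(x)\,\sigma^2(x)\cdots\sigma^{k-1}(x)\bigr)\,\sigma^k\bigl(x\,\sigma(x)\,\sigma^2(x)\cdots\sigma^{l-1}(x)\bigr)$$
$$= F(k)\,\sigma^k(F(l)). \qquad (*)$$

The condition that $N_{K/F}(x) = 1$ is exactly the condition that $F(n) = 1$. Then
$F(k+n) = F(k)\,\sigma^k(F(n)) = F(k)$ for all k, and it is meaningful to define
a 1-cochain $f : G \to K^\times$ in $C^1(G, K^\times)$ by $f(\sigma^k) = F(k)$. Condition (*)
implies that $f(\sigma^k\sigma^l) = f(\sigma^k)\,\sigma^k(f(\sigma^l))$, and hence f is a cocycle in $Z^1(G, K^\times)$.
Theorem 3.17 shows that f is a coboundary in $B^1(G, K^\times)$, necessarily satisfying
$f(\tau) = \tau(y)y^{-1}$ for some $y \in K^\times$ and all $\tau \in G$. Taking $\tau = \sigma$, we obtain
$x = f(\sigma) = \sigma(y)y^{-1}$, as required.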
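
The construction in the proofs of Theorem 3.17 and Corollary 3.18 can be carried out by hand in the quadratic case. The sketch below assumes the extension Q(i)/Q as the example, with sigma equal to complex conjugation and with the basis {1, i}; the class `GaussQ` and the function `hilbert90` are hypothetical names introduced only for this illustration. It forms gamma = theta + x sigma(theta) for a basis element theta and returns y = gamma^{-1}.

```python
from fractions import Fraction

class GaussQ:
    """Element re + im*i of Q(i), with exact rational coordinates."""
    def __init__(self, re, im=0):
        self.re, self.im = Fraction(re), Fraction(im)
    def __eq__(self, other):
        return (self.re, self.im) == (other.re, other.im)
    def __add__(self, other):
        return GaussQ(self.re + other.re, self.im + other.im)
    def __mul__(self, other):
        return GaussQ(self.re * other.re - self.im * other.im,
                      self.re * other.im + self.im * other.re)
    def conj(self):
        """The nontrivial element sigma of Gal(Q(i)/Q)."""
        return GaussQ(self.re, -self.im)
    def norm(self):
        """N(x) = x * sigma(x), a rational number."""
        return self.re ** 2 + self.im ** 2
    def inverse(self):
        n = self.norm()
        return GaussQ(self.re / n, -self.im / n)

def hilbert90(x):
    """Given x of norm 1, return y with x == sigma(y) * y^{-1}."""
    assert x.norm() == 1
    # The cocycle is f(1) = 1, f(sigma) = x, so the sum in the proof of
    # Theorem 3.17 is gamma = theta + x * sigma(theta); try theta = 1, then theta = i.
    for theta in (GaussQ(1), GaussQ(0, 1)):
        gamma = theta + x * theta.conj()
        if gamma.norm() != 0:
            return gamma.inverse()
    raise ValueError("no basis element gave a nonzero gamma")

x = GaussQ(Fraction(3, 5), Fraction(4, 5))    # norm 9/25 + 16/25 = 1
y = hilbert90(x)
```

The second basis element is needed only when x = -1, where theta = 1 gives gamma = 0; this mirrors the role of the nonsingular matrix M in the proof.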




Our final result concerning $H^q(\mathrm{Gal}(K/F), K^\times)$ for this chapter gives further
information about the special case in which $\mathrm{Gal}(K/F)$ is cyclic, but now for general
q. In combination with the study of crossed-product algebras, the case $q = 2$
of this result provides a way of constructing new examples of noncommutative
division algebras. A key step in the proof makes use of a fundamental general
property concerning cohomology of groups, and we therefore digress in Section 5
to establish this property.


5. Digression on Cohomology of Groups 


This section develops general material about cohomology of groups. Although 
the earlier sections of this chapter are helpful for motivation, the results that we 
discuss in this section do not rely on any previous material in this volume. It 
will be assumed that the reader is familiar with the definitions of complexes and 
exact sequences in Chapter X of Basic Algebra, as well as with the application 
of tensor-product functors and Hom functors to exact sequences and complexes. 
The material in Chapter VII of Basic Algebra on cohomology of groups will be 
helpful as background, but it is unnecessary from a logical point of view. If R is 
a ring with identity, we denote by $\mathcal{C}_R$ the category of all unital left R modules.

Let G be a group, not necessarily finite. We shall work with the integral group 
ring ZG of G. It has the universal mapping property that whenever G acts by 
automorphisms on an abelian group M, then the action by G on M extends to 
ZG in a unique way that makes M into a unital left ZG module. 

Here is a brief overview of what is to happen in this section: If G acts on
the abelian group M by automorphisms, then the abelian group $C^n(G, M)$ of
n-cochains is the set of functions into M from the n-fold product of G with itself,
the operation being given by addition of the values of the functions. To define the
cohomology group $H^n(G, M)$, one introduces suitable homomorphisms known
as "coboundary maps" $\delta_n : C^n(G, M) \to C^{n+1}(G, M)$ and shows that the
sequence

$$0 \longrightarrow C^0(G,M) \xrightarrow{\ \delta_0\ } C^1(G,M) \xrightarrow{\ \delta_1\ } \cdots \xrightarrow{\ \delta_{n-1}\ } C^n(G,M) \xrightarrow{\ \delta_n\ } \cdots$$

of abelian groups and homomorphisms is a complex in the category $\mathcal{C}_{\mathbb{Z}}$. Then
it is meaningful to define $H^n(G, M) = (\ker\delta_n)/(\mathrm{image}\,\delta_{n-1})$ for $n \geq 0$ if we
adopt the convention that $\mathrm{image}\,\delta_{-1} = 0$. The first thing that we shall do in this
section is to exhibit a certain exact sequence in the category $\mathcal{C}_{\mathbb{Z}G}$ such that the
above complex is obtained from it by application of the functor $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$
and the dropping of one term of the form $\mathrm{Hom}_{\mathbb{Z}G}(\mathbb{Z}, M)$. Except for a single
term Z, the members of this exact sequence will all be free ZG modules, and the
exact sequence will be called the "standard resolution of Z in the category $\mathcal{C}_{\mathbb{Z}G}$."
The exactness is proved in Theorem 3.20, and the application of $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$
to it appears after the proof of the theorem.

The next thing that we shall do is show that if the standard resolution of Z is
changed to any exact sequence in $\mathcal{C}_{\mathbb{Z}G}$ in such a way that the free ZG modules
are replaced by other free ZG modules and the module Z is left unchanged,
then application of $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to the new exact sequence leads to canonically
isomorphic cohomology groups. This result appears below as Theorem 3.31.
In brief, the cohomology groups $H^n(G, M)$ can be computed starting from any
"free resolution of Z" in the category $\mathcal{C}_{\mathbb{Z}G}$ in place of the standard resolution.

We begin by constructing the "standard resolution of Z." For $n \geq 0$, let $F_n$
be the free abelian group with Z basis the set of all (n+1)-tuples $(g_0, \dots, g_n)$
with all $g_i \in G$. The group G acts on $F_n$ by automorphisms, the action on the
members of the Z basis being

$$g(g_0, \dots, g_n) = (gg_0, \dots, gg_n).$$

The universal mapping property of ZG then allows us to regard each $F_n$ as a
unital left ZG module.


Lemma 3.19. For $n \geq 0$, the left ZG module $F_n$ is a free ZG module with
ZG basis consisting of all (n+1)-tuples $(1, g_1, \dots, g_n)$, i.e., all Z basis elements
with $g_0 = 1$.


PROOF. The formula $g_0(1, g_0^{-1}g_1, \dots, g_0^{-1}g_n) = (g_0, g_1, \dots, g_n)$ shows that
all members of the Z basis defining $F_n$ are ZG images of the asserted ZG basis;
hence the asserted ZG basis is a spanning set of $F_n$ relative to ZG. Suppose
that there are finitely many distinct members $h_j$ of G and finitely many distinct
(n+1)-tuples $(1, g_{i,1}, \dots, g_{i,n})$, and members $\sum_j n_{ij}h_j$ of ZG such that

$$\sum_i\Bigl(\sum_j n_{ij}h_j\Bigr)(1, g_{i,1}, \dots, g_{i,n}) = 0.$$

Then

$$\sum_{i,j} n_{ij}\,(h_j, h_jg_{i,1}, \dots, h_jg_{i,n}) = 0.$$

Since the $h_j$'s are distinct as j varies and the n-tuples $(g_{i,1}, \dots, g_{i,n})$ are distinct
as i varies, the (n+1)-tuples $(h_j, h_jg_{i,1}, \dots, h_jg_{i,n})$ are distinct as the pair $(i,j)$
varies. Thus the Z independence implies that $n_{ij} = 0$ for all i and j. This proves
the lemma.
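
The bookkeeping behind Lemma 3.19 can be checked mechanically on a small case. The sketch below assumes the cyclic group Z/4, written additively with identity 0, as the example group; the names `translate` and `split` are hypothetical. It writes each basis tuple as a group element acting on a tuple whose first entry is the identity, and the check function confirms that this decomposition is a bijection.

```python
from itertools import product

N = 4                      # assumed example group: Z/4 under addition mod 4
G = range(N)

def translate(g, t):
    """Action of g on a basis tuple: g(g_0, ..., g_n) = (g g_0, ..., g g_n)."""
    return tuple((g + x) % N for x in t)

def split(t):
    """Write t as g_0 acting on a tuple whose first entry is the identity 0."""
    g0 = t[0]
    return g0, tuple((x - g0) % N for x in t)

def decomposition_is_bijective(n):
    """Every (n+1)-tuple arises exactly once as g * (0, g_1, ..., g_n)."""
    tuples = list(product(G, repeat=n + 1))
    pieces = {split(t) for t in tuples}
    return (all(translate(*split(t)) == t for t in tuples)
            and all(base[0] == 0 for _, base in pieces)
            and len(pieces) == len(tuples))
```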




For $n \geq 1$, we define $\partial_{n-1} : F_n \to F_{n-1}$ as a function from the Z basis into
$F_{n-1}$ by

$$\partial_{n-1}(g_0, \dots, g_n) = \sum_{i=0}^n (-1)^i\,(g_0, \dots, \widehat{g_i}, \dots, g_n),$$

where the symbol $\widehat{\ }$ indicates an expression to be omitted. We extend $\partial_{n-1}$ to all
of $F_n$ by the universal mapping property of free abelian groups. For g in G and for
any Z generator x of $F_n$, it is evident that $\partial_{n-1}(gx) = g(\partial_{n-1}(x))$. Since $\partial_{n-1}$ is
a homomorphism of abelian groups, the formula $\partial_{n-1}(gx) = g(\partial_{n-1}(x))$ extends
to all x's in $F_n$. Since G and Z generate ZG, we obtain $\partial_{n-1}(rx) = r(\partial_{n-1}(x))$
for all $r \in \mathbb{Z}G$ and all $x \in F_n$. In other words, each $\partial_{n-1}$ is a ZG homomorphism.

We shall make use of one additional ZG homomorphism. According to Lemma
3.19, the ZG module $F_0$ is free on the ZG basis $\{(1)\}$. Let us think of the group G
as acting trivially by automorphisms on the abelian group Z. Under this action,
Z becomes a ZG module. Define $\varepsilon : F_0 \to \mathbb{Z}$ to be the ZG homomorphism with
$\varepsilon((1)) = 1$. Then $\varepsilon((g_0)) = g_0(\varepsilon((1))) = g_0 \cdot 1 = 1$ for all $g_0 \in G$. The ZG
homomorphism $\varepsilon$ is called the augmentation map.


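Theorem 3.20. If G is any group, then the sequence

$$\cdots \xrightarrow{\ \partial_{n+1}\ } F_{n+1} \xrightarrow{\ \partial_n\ } F_n \xrightarrow{\ \partial_{n-1}\ } \cdots \xrightarrow{\ \partial_0\ } F_0 \xrightarrow{\ \varepsilon\ } \mathbb{Z} \longrightarrow 0$$

of left unital ZG modules and ZG homomorphisms is exact.


REMARKS. The displayed sequence is called the standard resolution of Z in
the category $\mathcal{C}_{\mathbb{Z}G}$. The proof will be preceded by two lemmas.


Lemma 3.21. The sequence

$$\cdots \xrightarrow{\ \partial_{n+1}\ } F_{n+1} \xrightarrow{\ \partial_n\ } F_n \xrightarrow{\ \partial_{n-1}\ } \cdots \xrightarrow{\ \partial_0\ } F_0 \xrightarrow{\ \varepsilon\ } \mathbb{Z} \longrightarrow 0$$

in $\mathcal{C}_{\mathbb{Z}G}$ is a complex, i.e., $\partial_{n-1}\partial_n = 0$ for $n \geq 1$ and also $\varepsilon\partial_0 = 0$.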


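PROOF. With the understanding that the symbol $\widehat{\ }$ indicates an expression to
be omitted, we have

$$\partial_{n-1}\partial_n(g_0, \dots, g_{n+1}) = \sum_{i=0}^{n+1}(-1)^i\,\partial_{n-1}(g_0, \dots, \widehat{g_i}, \dots, g_{n+1})$$
$$= \sum_{i=0}^{n+1}\sum_{j=0}^{i-1}(-1)^{i+j}(g_0, \dots, \widehat{g_j}, \dots, \widehat{g_i}, \dots, g_{n+1})
+ \sum_{i=0}^{n+1}\sum_{j=i+1}^{n+1}(-1)^{i+j-1}(g_0, \dots, \widehat{g_i}, \dots, \widehat{g_j}, \dots, g_{n+1})$$
$$= \sum_{i=0}^{n+1}\sum_{j=0}^{i-1}(-1)^{i+j}(g_0, \dots, \widehat{g_j}, \dots, \widehat{g_i}, \dots, g_{n+1})
- \sum_{i=0}^{n+1}\sum_{j=i+1}^{n+1}(-1)^{i+j}(g_0, \dots, \widehat{g_i}, \dots, \widehat{g_j}, \dots, g_{n+1}).$$

If we interchange the order of summation in the second double sum on the right,
we see that the result equals the first double sum on the right. Thus the difference
is 0.

This handles all the consecutive compositions except for $\varepsilon\partial_0$. For this we have
$\varepsilon\partial_0(g_0, g_1) = \varepsilon(g_1) - \varepsilon(g_0) = 1 - 1 = 0$.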
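
The cancellation in Lemma 3.21 can be observed directly on basis tuples. The following is a minimal sketch in which a chain is modeled as a dictionary from tuples to integer coefficients; the representation and the names `boundary`, `augmentation`, and `squares_to_zero` are assumptions made for this illustration. Note that the boundary formula uses only positions in the tuple, not the group law, so any finite set of labels serves as the group here.

```python
from collections import defaultdict
from itertools import product

def boundary(chain):
    """Apply the alternating-sum boundary map to a chain {tuple: coefficient}."""
    out = defaultdict(int)
    for simplex, coeff in chain.items():
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]   # omit the i-th entry
            out[face] += (-1) ** i * coeff
    return {k: v for k, v in out.items() if v != 0}

def augmentation(chain):
    """The map epsilon on F_0: every basis element (g_0) is sent to 1."""
    return sum(chain.values())

def squares_to_zero(group, n):
    """Check that two consecutive boundary maps vanish on every basis tuple of F_{n+1}."""
    return all(boundary(boundary({t: 1})) == {}
               for t in product(group, repeat=n + 2))
```

For instance, `squares_to_zero(range(3), 1)` runs over all triples of elements of a three-element group and applies the boundary twice to each.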


Lemma 3.22. Fix s in G. For $n \geq 0$, define a homomorphism $h_n : F_n \to F_{n+1}$
of abelian groups to be the additive extension of the function with

$$h_n(g_0, \dots, g_n) = (s, g_0, \dots, g_n),$$

and define $h_{-1} : \mathbb{Z} \to F_0$ by $h_{-1}(k) = k(s)$. Then $\partial_nh_n + h_{n-1}\partial_{n-1} = 1$ for
$n \geq 1$, and also $\partial_0h_0 + h_{-1}\varepsilon = 1$.


PROOF. On the Z basis of (n+1)-tuples in $F_n$, we have

$$\partial_nh_n(g_0, \dots, g_n) = \partial_n(s, g_0, \dots, g_n)
= (g_0, \dots, g_n) + \sum_{i=0}^n (-1)^{i+1}(s, g_0, \dots, \widehat{g_i}, \dots, g_n)$$

and also

$$h_{n-1}\partial_{n-1}(g_0, \dots, g_n) = \sum_{i=0}^n (-1)^i(s, g_0, \dots, \widehat{g_i}, \dots, g_n).$$

The sum of these is $(g_0, \dots, g_n)$, as required. Also,

$$\partial_0h_0(g_0) = \partial_0(s, g_0) = (g_0) - (s) \quad\text{and}\quad h_{-1}\varepsilon(g_0) = h_{-1}1 = (s).$$

Thus $\partial_0h_0(g_0) + h_{-1}\varepsilon(g_0) = (g_0)$, and $\partial_0h_0 + h_{-1}\varepsilon = 1$.
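
The homotopy identity of Lemma 3.22 lends itself to a direct check. The sketch below again models a chain as a dictionary from tuples to integers, which is an assumption of the illustration, and the names `cone`, `h_minus_one`, and `add` are hypothetical. It verifies the two identities of the lemma on sample chains.

```python
from collections import defaultdict

def boundary(chain):
    """Alternating-sum boundary map on a chain {tuple: coefficient}."""
    out = defaultdict(int)
    for simplex, coeff in chain.items():
        for i in range(len(simplex)):
            out[simplex[:i] + simplex[i + 1:]] += (-1) ** i * coeff
    return {k: v for k, v in out.items() if v != 0}

def cone(s, chain):
    """h_n: prepend the fixed element s to every basis tuple."""
    return {(s,) + t: c for t, c in chain.items()}

def augmentation(chain):
    """epsilon on F_0: every basis element (g_0) is sent to 1."""
    return sum(chain.values())

def h_minus_one(s, k):
    """h_{-1}: the integer k is sent to k times the basis element (s)."""
    return {(s,): k} if k != 0 else {}

def add(p, q):
    """Sum of two chains, dropping zero coefficients."""
    out = defaultdict(int)
    for chain in (p, q):
        for t, c in chain.items():
            out[t] += c
    return {k: v for k, v in out.items() if v != 0}

s = 1
x = {(0, 1, 2): 1, (2, 2, 1): -3}     # a sample element of F_2
x0 = {(1,): 2, (2,): -1}              # a sample element of F_0
```

With these definitions, `add(boundary(cone(s, x)), cone(s, boundary(x)))` should return `x`, and the degree-0 identity uses `h_minus_one(s, augmentation(x0))` in place of the second term.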




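PROOF OF THEOREM 3.20. Lemma 3.21 gives $\mathrm{image}\,\partial_n \subseteq \ker\partial_{n-1}$ and
$\mathrm{image}\,\partial_0 \subseteq \ker\varepsilon$. For the reverse of the first inclusion, let $x \in F_n$ be given
with $\partial_{n-1}x = 0$ and $n \geq 1$. Then Lemma 3.22 gives $x = \partial_nh_nx + h_{n-1}\partial_{n-1}x$.
The second term on the right side is 0, and therefore $x = \partial_n(h_nx)$ is in $\mathrm{image}\,\partial_n$.

For the reverse of the inclusion $\mathrm{image}\,\partial_0 \subseteq \ker\varepsilon$, let $x \in F_0$ be given with
$\varepsilon x = 0$. Then Lemma 3.22 gives $x = \partial_0h_0x + h_{-1}\varepsilon x$. The second term on the
right side is 0, and therefore $x = \partial_0(h_0x)$ is in $\mathrm{image}\,\partial_0$.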


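With the standard resolution of Z in $\mathcal{C}_{\mathbb{Z}G}$ now known to be exact, we examine the
effect of applying the functor $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to it. This functor is contravariant
and carries $\mathcal{C}_{\mathbb{Z}G}$ to the category $\mathcal{C}_{\mathbb{Z}}$ of all abelian groups. On a unital left ZG
module F, this functor yields the abelian group $\mathrm{Hom}_{\mathbb{Z}G}(F, M)$. On a ZG module
homomorphism $\varphi : F \to F'$, it yields the homomorphism

$$\mathrm{Hom}(\varphi, 1) : \mathrm{Hom}_{\mathbb{Z}G}(F', M) \to \mathrm{Hom}_{\mathbb{Z}G}(F, M)$$

of abelian groups given by $\mathrm{Hom}(\varphi, 1)(\psi) = \psi \circ \varphi$ for $\psi \in \mathrm{Hom}_{\mathbb{Z}G}(F', M)$.
We know from Chapter X of Basic Algebra that this functor carries complexes to
complexes but does not necessarily preserve exactness.

Before applying $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to the standard resolution of Z, it is customary
to drop the term Z and the augmentation map, obtaining a modified sequence

$$\cdots \xrightarrow{\ \partial_{n+1}\ } F_{n+1} \xrightarrow{\ \partial_n\ } F_n \xrightarrow{\ \partial_{n-1}\ } \cdots \xrightarrow{\ \partial_0\ } F_0 \longrightarrow 0$$

that is still a complex in $\mathcal{C}_{\mathbb{Z}G}$. Let us define $d_n = \mathrm{Hom}(\partial_n, 1)$. Then the result of
applying $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to the modified complex is the complex

$$0 \longrightarrow \mathrm{Hom}_{\mathbb{Z}G}(F_0, M) \xrightarrow{\ d_0\ } \cdots \longrightarrow \mathrm{Hom}_{\mathbb{Z}G}(F_n, M) \xrightarrow{\ d_n\ } \mathrm{Hom}_{\mathbb{Z}G}(F_{n+1}, M) \xrightarrow{\ d_{n+1}\ } \cdots$$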


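in $\mathcal{C}_{\mathbb{Z}}$. To each $\varphi$ in $\mathrm{Hom}_{\mathbb{Z}G}(F_n, M)$, we associate $f = \Phi_n(\varphi)$ in $C^n(G, M)$ by
the definition

$$f(g_1, \dots, g_n) = \varphi(1, g_1, g_1g_2, \dots, g_1\cdots g_n).$$

Any member $\varphi$ of $\mathrm{Hom}_{\mathbb{Z}G}(F_n, M)$ is determined by its values on (n+1)-tuples
$(1, g_1, \dots, g_n)$, since we can factor out the first entry of the argument of $\varphi$ and
commute it past $\varphi$, and it follows that the system of group homomorphisms

$$\Phi_n : \mathrm{Hom}_{\mathbb{Z}G}(F_n, M) \to C^n(G, M)$$

is a system of isomorphisms of abelian groups. Let

$$\delta_n : C^n(G, M) \to C^{n+1}(G, M)$$

be the map corresponding to $d_n : \mathrm{Hom}_{\mathbb{Z}G}(F_n, M) \to \mathrm{Hom}_{\mathbb{Z}G}(F_{n+1}, M)$ under
this system of isomorphisms, namely $\delta_n = \Phi_{n+1} \circ d_n \circ \Phi_n^{-1}$. We can calculate
$\delta_n$ explicitly as follows: If $f = \Phi_n(\varphi)$, then $\delta_nf = (\Phi_{n+1}d_n\Phi_n^{-1})(\Phi_n)(\varphi) =
\Phi_{n+1}d_n\varphi$, and therefore

$$(\delta_nf)(g_1, \dots, g_{n+1}) = (d_n\varphi)(1, g_1, g_1g_2, \dots, g_1\cdots g_{n+1})$$
$$= \varphi(\partial_n(1, g_1, g_1g_2, \dots, g_1\cdots g_{n+1}))$$
$$= \varphi(g_1, g_1g_2, \dots, g_1\cdots g_{n+1})
+ \sum_{i=1}^n (-1)^i\,\varphi(1, g_1, \dots, \widehat{g_1\cdots g_i}, \dots, g_1\cdots g_{n+1})
+ (-1)^{n+1}\varphi(1, g_1, \dots, g_1\cdots g_n)$$
$$= g_1(f(g_2, g_3, \dots, g_{n+1}))
+ \sum_{i=1}^n (-1)^i f(g_1, \dots, g_ig_{i+1}, \dots, g_{n+1})
+ (-1)^{n+1}f(g_1, \dots, g_n).$$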


Comparing this formula with the original formula defining $\delta_n$ in Chapter VII of
Basic Algebra, we get a match. That is, we have obtained the complex in $\mathcal{C}_{\mathbb{Z}}$
defining the usual groups $H^n(G, M)$ by applying $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to the standard
resolution of Z in $\mathcal{C}_{\mathbb{Z}G}$ and implementing the system of isomorphisms $\Phi_n$. In
particular, we obtain a more conceptual proof than in Basic Algebra of the fact
that the sequence

$$0 \longrightarrow C^0(G,M) \xrightarrow{\ \delta_0\ } C^1(G,M) \xrightarrow{\ \delta_1\ } C^2(G,M) \xrightarrow{\ \delta_2\ } \cdots$$

is a complex and that cohomology groups are therefore well defined.
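
The explicit coboundary formula can be exercised on a small case. The sketch below assumes the group Z/2, written additively, acting on M = Z with the nontrivial element acting by -1; this example and the names `op`, `act`, and `coboundary` are not taken from the text. An n-cochain is stored as a dictionary from n-tuples of group elements to integers, and the function applies the three-part formula with a leading action term, the merged-entry terms, and the final term.

```python
from itertools import product

G = (0, 1)                       # the group Z/2, written additively

def op(g, h):
    """Group law."""
    return (g + h) % 2

def act(g, m):
    """Assumed action of G on M = Z: the nontrivial element acts by -1."""
    return -m if g == 1 else m

def coboundary(f, n):
    """Return delta_n f for an n-cochain f given as {n-tuple: integer}."""
    out = {}
    for g in product(G, repeat=n + 1):           # g = (g_1, ..., g_{n+1})
        total = act(g[0], f[g[1:]])              # g_1 f(g_2, ..., g_{n+1})
        for i in range(1, n + 1):                # merge g_i and g_{i+1}
            merged = g[:i - 1] + (op(g[i - 1], g[i]),) + g[i + 1:]
            total += (-1) ** i * f[merged]
        total += (-1) ** (n + 1) * f[g[:-1]]     # (-1)^{n+1} f(g_1, ..., g_n)
        out[g] = total
    return out

f1 = {(0,): 5, (1,): -7}         # an arbitrary 1-cochain
d1 = coboundary(f1, 1)           # its coboundary, a 2-cochain
```

Applying `coboundary` twice, for example `coboundary(d1, 2)`, should give the zero cochain, which is the statement that the sequence of cochain groups is a complex.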
This completes the discussion of the first main point of the section as outlined
in the overview at the beginning. Next, any exact sequence

$$\cdots \xrightarrow{\ \partial'_{n+1}\ } F'_{n+1} \xrightarrow{\ \partial'_n\ } F'_n \xrightarrow{\ \partial'_{n-1}\ } \cdots \xrightarrow{\ \partial'_0\ } F'_0 \xrightarrow{\ \varepsilon'\ } \mathbb{Z} \longrightarrow 0$$

in the category $\mathcal{C}_{\mathbb{Z}G}$ in which all ZG modules $F'_n$ for $n \geq 0$ are free ZG modules
is called a free resolution of Z in the category $\mathcal{C}_{\mathbb{Z}G}$. The second main point
of the section is that if we apply the functor $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to this sequence
with Z dropped, then the consecutive quotients of kernels modulo images are
canonically isomorphic to the cohomology groups $H^n(G, M)$ obtained above.
Thus $H^n(G, M)$ can be computed from any free resolution of Z, and we are
not obliged to use the standard free resolution. This result is stated precisely as
Theorem 3.31 below.

By way of preparation, let us establish a slightly more general setting and work
with it for a moment. Let $\mathcal{C}_R$ be the category of all unital left R modules, where R
is any ring with identity. According to circumstances, a complex X in $\mathcal{C}_R$ might
be written with decreasing indices as

$$X:\quad \cdots \xrightarrow{\ \partial_{n+1}\ } X_{n+1} \xrightarrow{\ \partial_n\ } X_n \xrightarrow{\ \partial_{n-1}\ } X_{n-1} \xrightarrow{\ \partial_{n-2}\ } \cdots$$

or with increasing indices as

$$X:\quad \cdots \xrightarrow{\ d_{n-2}\ } X_{n-1} \xrightarrow{\ d_{n-1}\ } X_n \xrightarrow{\ d_n\ } X_{n+1} \xrightarrow{\ d_{n+1}\ } \cdots$$

Mathematically these complexes amount to the same thing: if we rename each
$X_k$ in the second complex as $X_{-k}$ and rename each $d_k$ as $\partial_{-k-1}$, then we obtain
the first complex. However, it is convenient to allow both systems of indexing
because of applications.

For the first complex, which has decreasing indices, we define the n-th homology
of X, written $H_n(X)$, by

$$H_n(X) = (\ker\partial_{n-1})/(\mathrm{image}\,\partial_n).$$

For the second complex, which has increasing indices, we define the n-th cohomology
of X, written $H^n(X)$, by

$$H^n(X) = (\ker d_n)/(\mathrm{image}\,d_{n-1}).$$

In both cases the integer n is called the degree. In either case the homology
or cohomology is again a module in $\mathcal{C}_R$. The condition that X be a complex is
equivalent to the condition that the image of each incoming map be contained in
the kernel of the corresponding outgoing map, and this is precisely the condition
that the homology or cohomology be meaningful. Exactness at a particular
module in one of the complexes is the statement that the image of the incoming
map equals the kernel of the outgoing map. Thus the homology or cohomology
of X measures the extent to which the complex X fails to be exact.

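Because the nature of the indexing of a complex is not mathematically significant,
we will treat only the case of increasing indices for a while, and the
modules associated to our complexes will therefore be cohomology modules. A
cochain map² between two complexes X and Y in the same category $\mathcal{C}_R$ is a
system $f = \{f_n\}$ of R homomorphisms $f_n : X_n \to Y_n$ such that the various
squares commute in Figure 3.1.

$$\begin{array}{ccccccccc}
X:\ \cdots & \xrightarrow{\ d_{n-2}\ } & X_{n-1} & \xrightarrow{\ d_{n-1}\ } & X_n & \xrightarrow{\ d_n\ } & X_{n+1} & \xrightarrow{\ d_{n+1}\ } & \cdots \\
 & & \downarrow{\scriptstyle f_{n-1}} & & \downarrow{\scriptstyle f_n} & & \downarrow{\scriptstyle f_{n+1}} & & \\
Y:\ \cdots & \xrightarrow{\ d'_{n-2}\ } & Y_{n-1} & \xrightarrow{\ d'_{n-1}\ } & Y_n & \xrightarrow{\ d'_n\ } & Y_{n+1} & \xrightarrow{\ d'_{n+1}\ } & \cdots
\end{array}$$

FIGURE 3.1. A cochain map $f : X \to Y$.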


Proposition 3.23. A cochain map $f : X \to Y$ as in Figure 3.1 induces an R
homomorphism on cohomology $H^n(X) \to H^n(Y)$ in each degree.


PROOF. Suppose that $x_n$ is in $\ker d_n$, i.e., that $d_n(x_n) = 0$. The commutativity
of the right square gives $d'_n(f_n(x_n)) = f_{n+1}(d_n(x_n)) = 0$, and hence $f_n(x_n)$ is
in $\ker d'_n$. Suppose that $x_n$ is in $\mathrm{image}\,d_{n-1}$, i.e., that $x_n = d_{n-1}(x_{n-1})$ for some
$x_{n-1}$. The commutativity of the left square gives $f_n(x_n) = f_nd_{n-1}(x_{n-1}) =
d'_{n-1}(f_{n-1}(x_{n-1}))$, and hence $f_n(x_n)$ is in $\mathrm{image}\,d'_{n-1}$. Then it follows that $f_n|_{\ker d_n}$
descends to the quotient $(\ker d_n)/(\mathrm{image}\,d_{n-1})$, yielding a map of $H^n(X)$ into
$H^n(Y)$.


Suppose in the situation of Figure 3.1 that $g = \{g_n\}$ is a second cochain map
of X into Y. We say that f is homotopic³ to g, written $f \simeq g$, if there is a system
$h = \{h_n\}$ of maps $h_n : X_n \to Y_{n-1}$ in $\mathcal{C}_R$ such that $d'h + hd = f - g$, i.e., if
$d'_{n-1}h_n + h_{n+1}d_n = f_n - g_n$ for all n.


Proposition 3.24. In the situation of Figure 3.1 if $f = \{f_n\}$ and $g = \{g_n\}$ are
two cochain maps of X into Y and if f and g are homotopic, then f and g induce
identical maps $H^n(X) \to H^n(Y)$ in each degree.


PROOF. Suppose that $d_n(x_n) = 0$. Then $f_n(x_n) - g_n(x_n) = d'_{n-1}(h_n(x_n)) +
h_{n+1}(d_n(x_n)) = d'_{n-1}(h_n(x_n)) + 0$ shows that the images of $x_n$ under $f_n$ and $g_n$
in $Y_n$ differ by a member of $\mathrm{image}\,d'_{n-1}$.


Now we bring free R modules into the discussion. 


² The analogous kind of system in which the complexes have decreasing indices is called a chain
map.

³ An analogous definition is to be made in the case of two chain maps. If the maps of X are
$\partial_n : X_{n+1} \to X_n$ and the maps of Y are $\partial'_n : Y_{n+1} \to Y_n$, then we are to have $h_n : X_n \to Y_{n+1}$ with
$\partial'_nh_n + h_{n-1}\partial_{n-1} = f_n - g_n$.




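Proposition 3.25. For the diagram

$$\begin{array}{ccccc}
F & \xrightarrow{\ \partial\ } & M & \xrightarrow{\ \partial_0\ } & N \\
\downarrow{\scriptstyle f} & & \downarrow{\scriptstyle f_M} & & \downarrow{\scriptstyle f_N} \\
F' & \xrightarrow{\ \partial'\ } & M' & \xrightarrow{\ \partial'_0\ } & N'
\end{array}$$

in $\mathcal{C}_R$, suppose that the top and bottom rows are exact at M and $M'$, suppose
that the square on the right commutes, and suppose that F is a free R module.
Then there exists an R homomorphism $f : F \to F'$ that makes the left square
commute.


PROOF. If x is a free generator of F, then $0 = f_N\partial_0\partial(x) = \partial'_0(f_M\partial x)$. By
exactness at $M'$, $f_M\partial x$ lies in $\mathrm{image}(\partial')$. Choose any $y \in F'$ with $\partial'y = f_M\partial x$, and
define $f(x)$ to be this y. Then $f_M\partial x = \partial'fx$, and the left square commutes at x.
The universal mapping property of free R modules says that f extends to an R
homomorphism of F into $F'$, and the extension has $f_M\partial = \partial'f$, as required.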


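Corollary 3.26. In the category $\mathcal{C}_{\mathbb{Z}G}$, if the rows of the diagram

$$
\begin{array}{ccccccccccccc}
\cdots & \longrightarrow & X_{n+1} & \xrightarrow{\;\partial_n\;} & X_n & \xrightarrow{\;\partial_{n-1}\;} & \cdots & \xrightarrow{\;\partial_0\;} & X_0 & \xrightarrow{\;\varepsilon\;} & \mathbb{Z} & \longrightarrow & 0 \\
 & & \downarrow{\scriptstyle f_{n+1}} & & \downarrow{\scriptstyle f_n} & & & & \downarrow{\scriptstyle f_0} & & \downarrow{\scriptstyle 1} & & \\
\cdots & \longrightarrow & Y_{n+1} & \xrightarrow{\;\partial'_n\;} & Y_n & \xrightarrow{\;\partial'_{n-1}\;} & \cdots & \xrightarrow{\;\partial'_0\;} & Y_0 & \xrightarrow{\;\varepsilon'\;} & \mathbb{Z} & \longrightarrow & 0
\end{array}
$$

are free resolutions and the vertical identity map $1 : \mathbb{Z} \to \mathbb{Z}$ is given, then the
remaining vertical maps,

$$f_0 : X_0 \to Y_0,\ \dots,\ f_n : X_n \to Y_n,\ f_{n+1} : X_{n+1} \to Y_{n+1},\ \dots,$$

can be constructed inductively from the right to make all the squares commute.


REMARK. The resulting system $f = \{f_n\}$ is called a chain map over the
identity map $1 : \mathbb{Z} \to \mathbb{Z}$.


PROOF. There is no harm in including a vertical 0 map at the right between
the two 0 modules. Certainly the square whose verticals are the identity map
$1 : \mathbb{Z} \to \mathbb{Z}$ and the 0 map commutes. Proposition 3.25 is to be applied first to this
square and the second square from the right (with vertical $f_0$ to be constructed and
vertical $1 : \mathbb{Z} \to \mathbb{Z}$ given) to construct $f_0$, then to the second and third squares
from the right to construct $f_1$, and so on, inductively.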




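Proposition 3.27. For the diagram

$$
\begin{array}{ccccc}
\widetilde{F} & \xrightarrow{\;\partial\;} & F & \xrightarrow{\;\partial_1\;} & N \\
\downarrow{\scriptstyle \widetilde{f}} & & \downarrow{\scriptstyle f} & & \downarrow{\scriptstyle f_N} \\
\widetilde{F}' & \xrightarrow{\;\partial'\;} & F' & \xrightarrow{\;\partial'_1\;} & N'
\end{array}
$$

with diagonal maps $h : F \to \widetilde{F}'$ and $h_1 : N \to F'$
in $\mathcal{C}_R$, suppose that the top and bottom rows are exact at $F$ and $F'$, that the left and right
squares commute, that $F$ and $\widetilde{F}$ are free $R$ modules, and that $h_1 : N \to F'$ exists
with $f_N - \partial'_1 h_1$ vanishing on image$(\partial_1)$. Then there exists $h : F \to \widetilde{F}'$ such that
$\partial' h + h_1 \partial_1 = f$, and this property implies that $f - \partial' h$ vanishes on image$(\partial)$.


PROOF. If $x$ is a free generator of $F$, then $f(x) - h_1(\partial_1(x))$ is in $\ker(\partial'_1)$ because
$\partial'_1(fx - h_1\partial_1 x) = f_N\partial_1 x - \partial'_1 h_1 \partial_1 x = (f_N - \partial'_1 h_1)(\partial_1 x)$ and because $f_N - \partial'_1 h_1$
vanishes on image$(\partial_1)$ by assumption. Therefore $f(x) - h_1(\partial_1(x))$ is in image$(\partial')$,
and we can write $f(x) - h_1(\partial_1(x)) = \partial' a$ for some $a \in \widetilde{F}'$. Put $h(x) = a$. Then
$\partial' h x = \partial' a = fx - h_1\partial_1 x$, and $h$ has the required property on the generator $x$.
The universal mapping property of the free $R$ module $F$ allows us to extend $h$ to
an $R$ homomorphism $h : F \to \widetilde{F}'$, and the extension satisfies $\partial' h = f - h_1\partial_1$.
Once $h$ has this property, then necessarily $(f - \partial' h)\partial = (h_1\partial_1)\partial = h_1(\partial_1\partial) = 0$.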


Corollary 3.28. In the category $\mathcal{C}_{\mathbb{Z}G}$, if a free resolution $X = \{X_n\}$ of $\mathbb{Z}$ and a
chain map $f = \{f_n\}$ of $X$ with itself are given such that the map from $\mathbb{Z}$ to itself
is 0, then the chain map $f$ is homotopic to the zero chain map $g = \{g_n\}$ with
$g_n = 0$ for all $n$.


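PROOF. We are given the diagram

$$
\begin{array}{ccccccccccccc}
\cdots & \longrightarrow & X_n & \longrightarrow & \cdots & \longrightarrow & X_1 & \xrightarrow{\;\partial_0\;} & X_0 & \xrightarrow{\;\varepsilon\;} & \mathbb{Z} & \longrightarrow & 0 \\
 & & \downarrow{\scriptstyle f_n} & & & & \downarrow{\scriptstyle f_1} & & \downarrow{\scriptstyle f_0} & & \downarrow{\scriptstyle 0} & & \\
\cdots & \longrightarrow & X_n & \longrightarrow & \cdots & \longrightarrow & X_1 & \xrightarrow{\;\partial'_0\;} & X_0 & \xrightarrow{\;\varepsilon'\;} & \mathbb{Z} & \longrightarrow & 0
\end{array}
$$

in the category $\mathcal{C}_{\mathbb{Z}G}$ with the two rows as free resolutions and all squares
commuting. We are to construct maps $h_n : X_n \to X_{n+1}$, drawn as diagonals from the
top row to the bottom row, with $\partial'_n h_n + h_{n-1}\partial_{n-1} = f_n$.
Let $h_{-2}$ be the 0 map from the top 0 module to the bottom $\mathbb{Z}$, and let $h_{-1}$ be the 0
map from the top $\mathbb{Z}$ to the bottom $X_0$. Then $\partial'_n h_n + h_{n-1}\partial_{n-1} = f_n$ is satisfied
for $n = -1$ because the map $f_{-1}$ is the 0 map from $\mathbb{Z}$ to itself. Proposition 3.27
then allows us to construct inductively first $h_0$, then $h_1$, then $h_2$, and so on.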




Corollary 3.29. In the category $\mathcal{C}_{\mathbb{Z}G}$, if a free resolution $X = \{X_n\}$ of $\mathbb{Z}$ and a
chain map $f = \{f_n\}$ of $X$ with itself are given such that the map from $\mathbb{Z}$ to itself
is the identity 1, then the chain map $f$ is homotopic to the identity chain map
$g = \{g_n\}$ with $g_n = 1$ for all $n$.


PROOF. Apply Corollary 3.28 to f — 1. 


Corollary 3.30. In the category $\mathcal{C}_{\mathbb{Z}G}$, if two free resolutions $X = \{X_n\}$ of $\mathbb{Z}$
and $Y = \{Y_n\}$ of $\mathbb{Z}$ are given and if two chain maps $f : X \to Y$ and $g : Y \to X$
are given such that the map from $\mathbb{Z}$ to itself in each case is the identity 1, then $gf$
is homotopic to 1 and $fg$ is homotopic to 1.


PROOF. Apply Corollary 3.29 to fg and then to gf. 


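Theorem 3.31. If

$$\cdots \xrightarrow{\;\partial'_n\;} F'_n \xrightarrow{\;\partial'_{n-1}\;} \cdots \xrightarrow{\;\partial'_0\;} F'_0 \xrightarrow{\;\varepsilon'\;} \mathbb{Z} \longrightarrow 0$$

is any free resolution of $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$ and $M$ is a unital left $\mathbb{Z}G$ module,
then $H^n(G, M)$ is canonically isomorphic to the $n^{\text{th}}$ cohomology group of the
complex in $\mathcal{C}_{\mathbb{Z}}$ given by

$$0 \longrightarrow \operatorname{Hom}_{\mathbb{Z}G}(F'_0, M) \xrightarrow{\;d_0\;} \cdots \longrightarrow \operatorname{Hom}_{\mathbb{Z}G}(F'_n, M) \xrightarrow{\;d_n\;} \operatorname{Hom}_{\mathbb{Z}G}(F'_{n+1}, M) \xrightarrow{\;d_{n+1}\;} \cdots$$

with $d_n = \operatorname{Hom}(\partial'_n, 1)$ for $n \ge 0$.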


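PROOF. Let the resolution in the statement of the theorem be $Y$, and let $X$ be the
standard free resolution of $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$. Two applications of Corollary
3.26 produce chain maps $f : X \to Y$ and $g : Y \to X$ over $1 : \mathbb{Z} \to \mathbb{Z}$. Corollary
3.30 shows that $gf$ is homotopic to $1 = 1_X$ and $fg$ is homotopic to $1 = 1_Y$.
Apply the functor $\operatorname{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ throughout, including to the members of the
homotopies. Then we obtain chain maps

$$\operatorname{Hom}_{\mathbb{Z}G}(f, 1) : \operatorname{Hom}_{\mathbb{Z}G}(Y, M) \to \operatorname{Hom}_{\mathbb{Z}G}(X, M)$$

and

$$\operatorname{Hom}_{\mathbb{Z}G}(g, 1) : \operatorname{Hom}_{\mathbb{Z}G}(X, M) \to \operatorname{Hom}_{\mathbb{Z}G}(Y, M)$$

with $\operatorname{Hom}_{\mathbb{Z}G}(f, 1) \circ \operatorname{Hom}_{\mathbb{Z}G}(g, 1)$ homotopic to 1
and $\operatorname{Hom}_{\mathbb{Z}G}(g, 1) \circ \operatorname{Hom}_{\mathbb{Z}G}(f, 1)$ homotopic to 1.
Proposition 3.24 allows us to conclude that
$\operatorname{Hom}_{\mathbb{Z}G}(f, 1) \circ \operatorname{Hom}_{\mathbb{Z}G}(g, 1)$ induces the identity on $H^*(\operatorname{Hom}_{\mathbb{Z}G}(X, M))$
and
$\operatorname{Hom}_{\mathbb{Z}G}(g, 1) \circ \operatorname{Hom}_{\mathbb{Z}G}(f, 1)$ induces the identity on $H^*(\operatorname{Hom}_{\mathbb{Z}G}(Y, M))$.
Thus $\operatorname{Hom}_{\mathbb{Z}G}(g, 1)$ induces an isomorphism of each group $H^n(\operatorname{Hom}_{\mathbb{Z}G}(X, M))$
onto $H^n(\operatorname{Hom}_{\mathbb{Z}G}(Y, M))$.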


6. Relative Brauer Group when the Galois Group Is Cyclic 


This section has two parts to it. The first part specializes Theorem 3.31 to compute
group cohomology when the group in question is cyclic of finite order. The second
part applies this computation to $H^2(\operatorname{Gal}(K/F), K^\times)$ and obtains information
about Brauer groups. As a consequence we obtain new information about the
classification of noncommutative division algebras.

Let $G$ be a finite cyclic group of order $n$. Theorem 3.31 says that if $G$ acts by
automorphisms on an abelian group $M$, then $H^2(G, M)$ can be computed from
any free resolution of $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$. The standard resolution of $\mathbb{Z}$ is one
such resolution. We shall construct another such resolution that is special to the
case of $G$ cyclic and that makes the cohomology more transparent.

Let $G = \{1, s, s^2, \dots, s^{n-1}\}$. Lemma 3.19 notes that the free abelian group
on the 1-tuples $(1), (s), (s^2), \dots, (s^{n-1})$ is a free $\mathbb{Z}G$ module with $\mathbb{Z}G$ basis $(1)$.
In other words, the elements of the left $\mathbb{Z}G$ module $\mathbb{Z}G$ may be identified with
the integer linear combinations of these 1-tuples. Define two operators $T$ and $N$
from the left $\mathbb{Z}G$ module $\mathbb{Z}G$ into itself by

$$
\begin{aligned}
T &= \text{multiplication by } (s) - (1), \\
N &= \text{multiplication by } (1) + (s) + \cdots + (s^{n-1}).
\end{aligned}
$$

Each of these respects addition and commutes with multiplication by $(s)$, hence
is a $\mathbb{Z}G$ module homomorphism. We shall compute the kernel and image of each.

The kernel of $T$ consists of all elements for which left multiplication by $(s)$
fixes the element. The elements of $\mathbb{Z}G$ are of the form $\sum_{j=0}^{n-1} c_j (s^j)$, and $(s)$ times
this gives $c_{n-1}(1) + \sum_{j=1}^{n-1} c_{j-1}(s^j)$. Since $(1), (s), \dots, (s^{n-1})$ form a $\mathbb{Z}$ basis,
the condition to be in the kernel of $T$ is that $c_{n-1} = c_0 = c_1 = \cdots = c_{n-2}$. Thus

$$\ker T = \{c((1) + (s) + \cdots + (s^{n-1})) \mid c \in \mathbb{Z}\}.$$


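Also,

$$
\begin{aligned}
\operatorname{image} T &= \{\text{integer polynomials in } (s) \text{ divisible by } (s) - (1)\} \\
&= \{\text{integer polynomials equal to 0 when } s \text{ is set equal to 1}\} \\
&= \Big\{ \sum_{j=0}^{n-1} c_j (s^j) \;\Big|\; \sum_{j=0}^{n-1} c_j = 0 \Big\}.
\end{aligned}
$$

In the case of the operator $N$, we have $N(s^j) = (1) + (s) + \cdots + (s^{n-1})$, and
therefore $N\big(\sum_j c_j (s^j)\big) = \sum_j c_j \big((1) + (s) + \cdots + (s^{n-1})\big)$. Hence

$$
\begin{aligned}
\ker N &= \Big\{ \sum_{j=0}^{n-1} c_j (s^j) \;\Big|\; \sum_{j=0}^{n-1} c_j = 0 \Big\} = \operatorname{image} T, \\
\operatorname{image} N &= \{c((1) + (s) + \cdots + (s^{n-1})) \mid c \in \mathbb{Z}\} = \ker T.
\end{aligned}
$$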
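The kernel and image computations for $T$ and $N$ can be spot-checked by machine. The sketch below assumes $n = 5$ and stores $\sum_j c_j (s^j)$ as the coefficient list `[c_0, ..., c_{n-1}]`; the function names and the finite sample `box` are illustrative choices, not part of the text, so this is a sanity check rather than a proof.

```python
from itertools import accumulate, product

def mult_s(c):
    """Multiply by (s): the coefficient of (s^j) becomes c_{j-1}, indices mod n."""
    return [c[-1]] + c[:-1]

def T(c):
    """Multiplication by (s) - (1)."""
    return [a - b for a, b in zip(mult_s(c), c)]

def N(c):
    """Multiplication by (1) + (s) + ... + (s^{n-1})."""
    return [sum(c)] * len(c)

def preimage_under_T(c):
    """For c with coefficient sum 0, return b with T(b) = c (negated partial sums)."""
    return [-t for t in accumulate(c)]

n = 5
zero = [0] * n
box = [list(c) for c in product(range(-2, 3), repeat=n)]   # finite sample of Z[G]

# ker T consists of the constant coefficient vectors.
ker_T = [c for c in box if T(c) == zero]
assert all(len(set(c)) == 1 for c in ker_T)

# ker N consists of the vectors with coefficient sum 0, and each such
# vector is T of something, so ker N = image T on the sample.
ker_N = [c for c in box if N(c) == zero]
assert all(sum(c) == 0 for c in ker_N)
assert all(T(preimage_under_T(c)) == c for c in ker_N)

# N T = 0 and T N = 0, so image T lies in ker N and image N lies in ker T.
assert all(N(T(c)) == zero and T(N(c)) == zero for c in box)
```

The explicit preimage uses the same telescoping that underlies the description of image $T$ as the polynomials vanishing at $s = 1$.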




An immediate consequence of this and a supplementary argument concerning the 
augmentation map is the following proposition. 


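Proposition 3.32. If $G$ is a finite cyclic group, then the sequence

$$\cdots \xrightarrow{\;T\;} \mathbb{Z}G \xrightarrow{\;N\;} \mathbb{Z}G \xrightarrow{\;T\;} \cdots \xrightarrow{\;N\;} \mathbb{Z}G \xrightarrow{\;T\;} \mathbb{Z}G \xrightarrow{\;\varepsilon\;} \mathbb{Z} \longrightarrow 0$$

is a free resolution of $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$.


PROOF. We still need to check exactness at the first $\mathbb{Z}G$ from the right. The
map $\varepsilon$ is the $\mathbb{Z}G$ homomorphism with $\varepsilon((1)) = 1$. Hence $\varepsilon((s^j)) = 1$ for all $j$,
and $\varepsilon\big(\sum_j c_j (s^j)\big) = \sum_j c_j$. Thus $\ker \varepsilon = \ker N = \operatorname{image} T$, and exactness
is proved.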


Corollary 3.33. If $G$ is a finite cyclic group and $M$ is an abelian group on
which $G$ acts by automorphisms, then

$$H^2(G, M) \cong M^G / ((1) + (s) + \cdots + (s^{n-1})) M,$$

where $M^G$ is the subgroup of all elements of $M$ fixed by $G$.


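PROOF. Let us number the terms $\mathbb{Z}G$ in the resolution of Proposition 3.32
starting with index 0 from the right. Combining Proposition 3.32 with Theorem
3.31, we see that we may compute $H^2(G, M)$ as the cohomology of the complex
obtained by applying the functor $\operatorname{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to the terms with indices 1, 2, 3
in the resolution in Proposition 3.32. Thus $H^2(G, M)$ is the cohomology at the
middle of the complex

$$\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M) \xrightarrow{\;(\,\cdot\,)\circ N\;} \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M) \xrightarrow{\;(\,\cdot\,)\circ T\;} \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M).$$

The mapping $\alpha \mapsto \alpha((1))$ of $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M)$ into $M$ is one-one and onto, and
we can identify members $\alpha$ of $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M)$ with the corresponding elements
$\alpha((1))$ accordingly. If $\alpha$ is in $\ker((\,\cdot\,)\circ T)$, then $\alpha(T((1))) = 0$, and we thus have
$\alpha((s)) = \alpha((1))$ and $(s)\alpha((1)) = \alpha((1))$. Hence $\alpha((1))$ is in $M^G$. These steps
can be reversed, and thus $\ker((\,\cdot\,)\circ T) = M^G$. If $\beta$ is in image$((\,\cdot\,)\circ N)$, then
$\beta = \alpha \circ N$ for some $\alpha \in \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M)$, and thus

$$\beta((1)) = \alpha((1) + (s) + \cdots + (s^{n-1})) = \alpha((1)) + (s)\alpha((1)) + \cdots + (s^{n-1})\alpha((1)).$$

Since $\alpha((1))$ is a completely arbitrary element of $M$, we see that image$((\,\cdot\,)\circ N) =
((1) + (s) + \cdots + (s^{n-1})) M$, and the result follows.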
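For a concrete feel for the formula $M^G/((1) + (s) + \cdots + (s^{n-1}))M$, here is a small sketch under assumptions that are not in the text: $M = \mathbb{Z}/m\mathbb{Z}$, with the generator $s$ acting as multiplication by a unit `u` satisfying $u^n \equiv 1 \bmod m$. The function name `h2_order_cyclic` is hypothetical.

```python
def h2_order_cyclic(n, m, u):
    """Order of M^G / (1 + s + ... + s^{n-1}) M for M = Z/mZ, where the
    generator s of a cyclic group G of order n acts on M as multiplication
    by u. Both M and the action are assumptions of this sketch."""
    assert pow(u, n, m) == 1 % m, "s must act with order dividing n"
    fixed = {x for x in range(m) if (u * x - x) % m == 0}      # M^G
    norm = sum(pow(u, k, m) for k in range(n)) % m             # 1 + u + ... + u^{n-1}
    image = {(norm * x) % m for x in range(m)}                 # the subgroup N M
    assert image <= fixed                                      # N M lies in M^G
    return len(fixed) // len(image)

print(h2_order_cyclic(2, 4, 1))   # trivial action of a 2-element group on Z/4Z
print(h2_order_cyclic(2, 4, 3))   # action by negation on Z/4Z
print(h2_order_cyclic(2, 5, 4))   # action by negation on Z/5Z
```

With trivial action the quotient is $M/nM$, so the first call reports the order of $(\mathbb{Z}/4\mathbb{Z})/2(\mathbb{Z}/4\mathbb{Z})$.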




Now we specialize to the Galois case that has occupied our attention in this
chapter. Let $K/F$ be a finite Galois extension of fields. We are going to set $G =
\operatorname{Gal}(K/F)$, $n = \dim_F K$, and $M = K^\times$. To take advantage of Corollary 3.33, we
suppose that $\operatorname{Gal}(K/F)$ is cyclic. Then $M^G = (K^\times)^G = F^\times$. If $x$ is an element
of $K^\times$, then the orbit $Gx$ is $\{x, sx, s^2x, \dots, s^{n-1}x\}$. Remembering that we are
using additive notation in working with cohomology of groups and multiplicative
notation in working with $K^\times$, we see that the element $((1) + (s) + \cdots + (s^{n-1}))$
of $\mathbb{Z}G$ is to be regarded as operating by giving the product of the members of an
orbit in $K^\times$. This product for the orbit of $x \in K^\times$ is $N_{K/F}(x)$, and Corollary
3.33 thus specializes to the following result.


Corollary 3.34. If $K/F$ is a finite Galois extension of fields such that
$\operatorname{Gal}(K/F)$ is cyclic, then

$$H^2(\operatorname{Gal}(K/F), K^\times) \cong F^\times / N_{K/F}(K^\times).$$


Corollary 3.34 considerably simplifies the proofs of Frobenius’s Theorem
about division algebras over the reals (Theorem 2.50) and Wedderburn’s Theorem
about finite division rings (Theorem 2.48), and thus the theory in Chapter III has
added something to the theory of Chapter II even in these very special situations.
In the case of the Frobenius theorem, the only nontrivial algebraic extension of
$\mathbb{R}$ is $\mathbb{C}$, and thus Theorem 3.14 and Corollary 3.34 give

$$
\begin{aligned}
\mathcal{B}(\mathbb{R}) = \mathcal{B}(\mathbb{C}/\mathbb{R}) &\cong H^2(\operatorname{Gal}(\mathbb{C}/\mathbb{R}), \mathbb{C}^\times) \\
&\cong \mathbb{R}^\times / N_{\mathbb{C}/\mathbb{R}}(\mathbb{C}^\times) = \mathbb{R}^\times / (\mathbb{R}^\times)^{+} \cong \mathbb{Z}/2\mathbb{Z}.
\end{aligned}
$$

Hence the reals and the quaternions are the only finite-dimensional central simple
division algebras over $\mathbb{R}$.

In the case of the Wedderburn theorem, suppose that a finite field $K$ splits a
central division algebra over a field $F$ with $q$ elements. Say that $|K| = q^n$. For
finite fields the Galois groups are always cyclic, and thus $\operatorname{Gal}(K/F)$ is cyclic of
order $n$, generated by the map $x \mapsto x^q$. In view of Corollary 3.34, the Wedderburn
theorem follows if $F^\times / N_{K/F}(K^\times)$ is shown to be trivial, i.e., if the norm map
$N_{K/F} : K^\times \to F^\times$ is onto. The group $K^\times$ is cyclic, say with a generator $x_0$ of
order $q^n - 1$. Since the norm of an element is the product of the images under
the Galois group, the norm of $x_0$ is given by

$$N_{K/F}(x_0) = x_0\, x_0^{q}\, x_0^{q^2} \cdots x_0^{q^{n-1}} = x_0^{1 + q + q^2 + \cdots + q^{n-1}} = x_0^{(q^n - 1)/(q - 1)}.$$

This has order $q - 1$, not less, and thus is a generator of $F^\times$. Thus the norm map
is onto $F^\times$.
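The surjectivity of the norm can be confirmed by brute force in the smallest interesting case. The sketch below assumes $q = 3$ and $n = 2$ and models $K = \mathbb{F}_9$ as $\mathbb{F}_3[i]$ with $i^2 = -1$ (legitimate because $x^2 + 1$ is irreducible mod 3); the pair representation and all function names are choices made for this illustration only.

```python
# Elements of F_9 are pairs (a, b) meaning a + b*i with a, b in F_3.
Q = 3

def mul(x, y):
    """Product in F_3[i], using i^2 = -1."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % Q, (a * d + b * c) % Q)

def power(x, e):
    """x^e by repeated multiplication."""
    result = (1, 0)
    for _ in range(e):
        result = mul(result, x)
    return result

def norm(x):
    """N_{K/F}(x) = x * x^q, the product over the Galois group {1, Frobenius}."""
    return mul(x, power(x, Q))

def order(x):
    """Multiplicative order of a nonzero element."""
    k, y = 1, x
    while y != (1, 0):
        y, k = mul(y, x), k + 1
    return k

units = [(a, b) for a in range(Q) for b in range(Q) if (a, b) != (0, 0)]
norms = {norm(x) for x in units}

# Every norm lies in F = F_3 (second coordinate 0), and both units 1, 2 occur.
assert norms == {(1, 0), (2, 0)}

# A generator x0 of K^x has order q^n - 1 = 8; its norm has order q - 1 = 2.
x0 = next(x for x in units if order(x) == Q * Q - 1)
assert order(norm(x0)) == Q - 1
```

Here the norm of $a + bi$ works out to $a^2 + b^2$, which makes the two attained values easy to confirm by hand.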




For a more difficult example that we can settle completely, consider the case
that $F = \mathbb{Q}$ and $K = \mathbb{Q}(\sqrt{m})$ for a square-free integer $m$ other than 1. The
Galois group in this case is a 2-element group and is in particular cyclic. Thus
Corollary 3.34 applies. The norm of the member $x + y\sqrt{m}$ of $K$, where $x$
and $y$ are in $\mathbb{Q}$, is $x^2 - my^2$. The problem of determining the quotient group
$F^\times / N_{K/\mathbb{Q}}(K^\times)$ may be rephrased in terms of genera as in Section I.5.
Specifically the field discriminant $D$ is defined to be $m$ if $m \equiv 1 \bmod 4$ and to be $4m$ if
$m \not\equiv 1 \bmod 4$. A genus for $\mathbb{Q}(\sqrt{m})$ is an equivalence class of primitive quadratic
forms $ax^2 + bxy + cy^2$ whose discriminant matches the field discriminant $D$,
except that the theory of Chapter I discards all negative definite forms.
Equivalence is determined by the action of $\mathrm{SL}(2, \mathbb{Q})$. Lemma 1.13 shows for $D > 0$
that each nonzero rational number is a value taken on by the members of one and
only one genus at points $(x, y) \ne (0, 0)$ with $x$ and $y$ both rational; for $D < 0$,
Lemma 1.13 applies to positive definite forms and positive rational numbers. Let
us now enlarge the definition of genera to include negative definite forms and
negative rational numbers when $D < 0$.

The definition of the multiplication of classes of forms is set up so as to
be compatible with multiplication of the values of the quadratic forms, and the
genera define a group, the identity element being the principal genus. Since
a representative of the principal genus is $x^2 - my^2$, the nonzero rational
values corresponding to the principal genus are exactly the members of the group
$N_{K/\mathbb{Q}}(K^\times)$. Consequently the quotient group $F^\times / N_{K/\mathbb{Q}}(K^\times)$ is isomorphic to
the group of genera.⁴ The easy result concerning the group of genera is Theorem
1.14, which says that this group is finite abelian and that every nontrivial element
has order 2; since $\mathcal{B}(K/F) \cong F^\times / N_{K/\mathbb{Q}}(K^\times)$, Corollary 3.15 gives another way
of seeing that every nontrivial element has order 2. The hard result, which appears
in Problems 25-29 at the end of Chapter I, identifies the order of the group of
genera explicitly.⁵ If $D > 0$, then the order of the group of genera is $2^{g'}$, where
$g' + 1$ is the number of distinct prime divisors of $D$; if $D < 0$, then the order of
the group of genera is $2^{g'+1}$.

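Consequently if $m$ has $g + 1$ distinct prime divisors, then the relative Brauer
group is a product of 2-element groups whose order is given by

$$
|\mathcal{B}(\mathbb{Q}(\sqrt{m})/\mathbb{Q})| =
\begin{cases}
2^{g} & \text{if } m > 0 \text{ and } m \not\equiv 3 \bmod 4, \\
2^{g+1} & \text{if } m > 0 \text{ and } m \equiv 3 \bmod 4, \\
2^{g+1} & \text{if } m < 0 \text{ and } m \not\equiv 3 \bmod 4, \\
2^{g+2} & \text{if } m < 0 \text{ and } m \equiv 3 \bmod 4.
\end{cases}
$$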


⁴With the understanding that genera from negative definite forms are to be allowed if $D < 0$.
⁵In quoting this result, we are now making allowances for genera corresponding to negative
definite forms.




The example with $K/\mathbb{Q}$ quadratic shows the kind of information that has to go
into a complete determination of the relative Brauer group when $K/\mathbb{Q}$ is Galois.
Showing that a relative Brauer group is nontrivial in a case with $\operatorname{Gal}(K/\mathbb{Q})$ cyclic
is considerably easier. According to Corollary 3.34, all one needs to know is that
the norm function does not carry $K^\times$ onto $\mathbb{Q}^\times$, and congruence conditions can
be used as a first step in addressing this question; Problem 4 at the end of the
chapter illustrates this principle. Problems 15-17 at the end of Chapter II give
a construction in this situation of nontrivial central simple algebras over $\mathbb{Q}$ that
are split by $K$, and such algebras whose dimension is the square of a prime are
necessarily division algebras. Problems 6-12 at the end of the present chapter
give a sufficient condition for obtaining a division algebra when the dimension is
not the square of a prime.


7. Problems 


1. Let $A$ be a finite-dimensional central simple algebra over a field $F$, let $K$ be a
subfield of $A$, and let $B$ be the centralizer of $K$ in $A$.
(a) Arguing as in the proof of Theorem 3.3, exhibit a one-one algebra
homomorphism $A \otimes_F K \to \operatorname{End}_{B^o} A$.
(b) Referring to the proof of Theorem 2.2 and counting dimensions with the aid
of the Double Centralizer Theorem, prove that the mapping in (a) is onto
$\operatorname{End}_{B^o} A$.
(c) Deduce that $A \otimes_F K$ and $B$ yield the same member of $\mathcal{B}(K)$.


2. Let $a = a(\sigma, \tau)$ be a 2-cocycle in $Z^2(\operatorname{Gal}(K/F), K^\times)$, where $K/F$ is a finite
Galois extension of fields. Prove for each $\tau$ that $\prod_{\sigma \in \operatorname{Gal}(K/F)} a(\sigma, \tau)$ lies in $F^\times$.


3. Let $K/F$ be a finite Galois extension of fields with $\operatorname{Gal}(K/F)$ cyclic. Corollary
3.34 identifies $H^q(\operatorname{Gal}(K/F), K^\times)$ for $q = 2$. Identify this group for all other
values of $q > 0$.


Problems 4-5 amplify the discussion of cyclic algebras that was begun in Problems
17-19 at the end of Chapter II. Problem 4 in effect produces an explicit division
algebra of dimension 9 over $\mathbb{Q}$, and Problem 5 hints at the existence of an explicit
division algebra of dimension $n^2$ over $\mathbb{Q}$ for each integer $n > 1$.


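4. Let $\zeta = e^{2\pi i/7}$, and let $K = \mathbb{Q}(\zeta) \cap \mathbb{R}$.
(a) Show that $K/\mathbb{Q}$ is a Galois extension of degree 3, that a basis for $K$ over
$\mathbb{Q}$ consists of $\tau_1 = \zeta + \zeta^{-1}$, $\tau_2 = \zeta^2 + \zeta^{-2}$, $\tau_3 = \zeta^3 + \zeta^{-3}$, and that the
Galois group permutes $\tau_1, \tau_2, \tau_3$ cyclically.
(b) Show that if $a, b, c$ are in $\mathbb{Q}$, then

$$
\begin{aligned}
N_{K/\mathbb{Q}}(a\tau_1 + b\tau_2 + c\tau_3) = {} & abc(\tau_1^3 + \tau_2^3 + \tau_3^3) \\
& + (a^3 + b^3 + c^3 + 3abc)\,\tau_1\tau_2\tau_3 \\
& + (a^2b + c^2a + b^2c)(\tau_1^2\tau_2 + \tau_2^2\tau_3 + \tau_3^2\tau_1) \\
& + (a^2c + ab^2 + bc^2)(\tau_1\tau_2^2 + \tau_2\tau_3^2 + \tau_3\tau_1^2).
\end{aligned}
$$

(c) Verify the following identities:

$$
\begin{gathered}
\tau_1 + \tau_2 + \tau_3 = -1, \\
\tau_1\tau_2 = \tau_1 + \tau_3, \qquad \tau_2\tau_3 = \tau_1 + \tau_2, \qquad \tau_1\tau_3 = \tau_2 + \tau_3, \\
\tau_1^2 = \tau_2 + 2, \qquad \tau_2^2 = \tau_3 + 2, \qquad \tau_3^2 = \tau_1 + 2.
\end{gathered}
$$

(d) Combine (b) and (c) to show that

$$
N_{K/\mathbb{Q}}(a\tau_1 + b\tau_2 + c\tau_3) = (a^3 + b^3 + c^3) - abc
+ 3(a^2b + ac^2 + b^2c) - 4(a^2c + ab^2 + bc^2).
$$

(e) Under the assumption that $a, b, c$ are integers with $\mathrm{GCD}(a, b, c) = 1$, show
that $N_{K/\mathbb{Q}}(a\tau_1 + b\tau_2 + c\tau_3) \not\equiv 0 \bmod 3$.
(f) Deduce from (e) that $r = 3$ is not in $N_{K/\mathbb{Q}}(K^\times)$. (Educational note:
Consequently Problems 18-19 at the end of Chapter II produce an explicit
division algebra over $\mathbb{Q}$ of dimension 9.)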

5. (a) Show for each integer $n > 1$ that there exists a prime $p$ such that $n$ divides
$p - 1$.
(b) Deduce for this $p$ that there exists a field $L$ with $\mathbb{Q} \subseteq L \subseteq \mathbb{Q}(e^{2\pi i/p})$ such
that the field extension $L/\mathbb{Q}$ is a Galois extension whose Galois group is
cyclic of order $n$.
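The identities and the norm formula in Problem 4 lend themselves to a numerical sanity check, which is no substitute for the requested proofs. The sketch below uses $\tau_k = 2\cos(2\pi k/7)$ in floating point; the function names `norm_product` and `norm_formula` and the sampled ranges are illustrative assumptions.

```python
import math
from itertools import product

# tau_k = zeta^k + zeta^{-k} = 2 cos(2 pi k / 7) for zeta = e^{2 pi i / 7}.
tau = [2 * math.cos(2 * math.pi * k / 7) for k in (1, 2, 3)]
t1, t2, t3 = tau

def norm_product(a, b, c):
    """Product of the three conjugates of a*tau1 + b*tau2 + c*tau3 (floating point)."""
    return ((a * t1 + b * t2 + c * t3)
            * (a * t2 + b * t3 + c * t1)
            * (a * t3 + b * t1 + c * t2))

def norm_formula(a, b, c):
    """The closed form of part (d), exact for integer inputs."""
    return ((a**3 + b**3 + c**3) - a * b * c
            + 3 * (a * a * b + a * c * c + b * b * c)
            - 4 * (a * a * c + a * b * b + b * c * c))

# Sample identities from part (c), checked numerically.
assert math.isclose(t1 + t2 + t3, -1)
assert math.isclose(t1 * t2, t1 + t3)
assert math.isclose(t2 * t3, t1 + t2)
assert math.isclose(t1 * t3, t2 + t3)
assert math.isclose(t1 * t1, t2 + 2)

# Part (d): the product of conjugates agrees with the closed form on a sample.
for a, b, c in product(range(-3, 4), repeat=3):
    assert math.isclose(norm_product(a, b, c), norm_formula(a, b, c), abs_tol=1e-9)

# Part (e): the closed form is nonzero mod 3 unless a, b, c are all 0 mod 3.
for a, b, c in product(range(3), repeat=3):
    if (a, b, c) != (0, 0, 0):
        assert norm_formula(a, b, c) % 3 != 0
```

Because the closed form is a polynomial with integer coefficients, checking residues mod 3 covers every integer triple with $\mathrm{GCD}(a, b, c) = 1$, since such a triple cannot have all entries divisible by 3.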


Problems 6-12 continue the discussion of cyclic algebras that was begun in Problems
17-19 at the end of Chapter II and continued in Problems 4-5 above. Let $F$ be any
field, and let $K$ be a finite Galois extension of $F$ whose Galois group $G = \operatorname{Gal}(K/F)$
is cyclic of order $n$. Let $\sigma$ be a generator of $G$, fix an element $r \ne 0$ in $F$, and let $A$
be the subset of matrices in $M_n(K)$ of the form

$$
\begin{pmatrix}
c_1 & c_2 & c_3 & \cdots & c_n \\
r\sigma(c_n) & \sigma(c_1) & \sigma(c_2) & \cdots & \sigma(c_{n-1}) \\
r\sigma^2(c_{n-1}) & r\sigma^2(c_n) & \sigma^2(c_1) & \cdots & \sigma^2(c_{n-2}) \\
\vdots & & & \ddots & \vdots \\
r\sigma^{n-1}(c_2) & r\sigma^{n-1}(c_3) & r\sigma^{n-1}(c_4) & \cdots & \sigma^{n-1}(c_1)
\end{pmatrix}.
$$

Identify $c \in K$ with the diagonal member of $A$ for which $c_1 = c$ and $c_2 = \cdots = c_n =
0$, and let $j$ be the member of $A$ for which $c_1 = 0$, $c_2 = 1$, and $c_3 = \cdots = c_n = 0$.
Under this identification every member of $A$ has a unique expansion as $\sum_{k=1}^{n} c_k j^{k-1}$
with all $c_k$ in $K$, and the element $j$ satisfies $j^n = r$ and $jcj^{-1} = \sigma(c)$ for $c \in K$.
Take it as known that $A$ is a central simple algebra over $F$ of dimension $n^2$. This
series of problems leads in part to another theorem due to Wedderburn. (However, a
more direct proof of the theorem of Wedderburn without the other results is possible.)
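The relations $j^n = r$ and $jc = \sigma(c)j$ can be checked directly in the smallest case. The sketch below assumes $K = \mathbb{C}$, $F = \mathbb{R}$, $n = 2$, and $\sigma$ equal to complex conjugation, an example chosen here for illustration; with $r = -1$ the resulting algebra is the familiar matrix model of the quaternions. The helper names are hypothetical.

```python
def sigma_power(c, i):
    """sigma^i applied to c, where sigma is complex conjugation (order 2)."""
    return c.conjugate() if i % 2 else c

def cyclic_matrix(cs, r):
    """n-by-n matrix whose (i, k) entry is sigma^i(c_{(k - i) mod n}),
    multiplied by r when the entry lies below the diagonal (k < i)."""
    n = len(cs)
    return [[(r if k < i else 1) * sigma_power(cs[(k - i) % n], i)
             for k in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][k] for t in range(n)) for k in range(n)]
            for i in range(n)]

r = -1
c = 2 + 3j
j = cyclic_matrix([0, 1], r)                  # c_1 = 0, c_2 = 1
diag_c = cyclic_matrix([c, 0], r)             # the image of c in A
diag_sigma_c = cyclic_matrix([c.conjugate(), 0], r)

assert matmul(j, j) == [[r, 0], [0, r]]                   # j^n = r
assert matmul(j, diag_c) == matmul(diag_sigma_c, j)       # j c = sigma(c) j

# For r = -1 the determinant of a member of A is |c_1|^2 + |c_2|^2,
# which vanishes only for the zero matrix.
m = cyclic_matrix([1 + 2j, 3 - 1j], r)
det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
assert det == 15
```

The determinant computed at the end is the quantity that Problems 13-20 call the reduced norm, and its nonvanishing on nonzero members reflects the fact that $-1$ is not a norm from $\mathbb{C}$ to $\mathbb{R}$.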


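6. In the construction of factor sets in Section 2, use $x_{\sigma^k} = j^k$ for $0 \le k \le n - 1$.
Show that the algebra $A$ above corresponds to the 2-cocycle $a$ with

$$
a(\sigma^k, \sigma^l) =
\begin{cases}
1 & \text{if } k + l < n, \\
r & \text{if } k + l \ge n.
\end{cases}
$$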

7. Under the assumption that $r = N_{K/F}(x)$ with $x \in K^\times$, show that the choice
$c_{\sigma^k} = x\,\sigma(x)\,\sigma^2(x) \cdots \sigma^{k-1}(x)$ exhibits the factor set of the previous problem as
a trivial factor set and hence shows that $A \cong M_n(F)$.

8. Let $F = \{F_k\}$ be the standard free resolution of $\mathbb{Z}$ in $\mathcal{C}_{\mathbb{Z}G}$, and let $X = \{X_k\}$
be the free resolution of Proposition 3.32. The latter has $X_k = \mathbb{Z}G$ for every
$k \ge 0$. Trace through the proof of Corollary 3.26, and show that the proof allows
a chain map $f = \{f_k\}$ to be defined in such a way that the values of $f_0, f_1, f_2$
on standard $\mathbb{Z}G$ basis elements of $F_0, F_1, F_2$ are $f_0(1) = 1$, $f_1(1, \sigma^k) =
-(1 + \sigma + \cdots + \sigma^{k-1})$ for $0 \le k < n$, and

$$
f_2(1, \sigma^k, \sigma^l) =
\begin{cases}
0 & \text{if } 0 \le k \le l < n, \\
-\sigma^l & \text{if } 0 \le l < k < n.
\end{cases}
$$

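9. Let $\Phi_2 : \operatorname{Hom}_{\mathbb{Z}G}(F_2, K^\times) \to C^2(G, K^\times)$ be the isomorphism of Section 5,
and let $\psi$ be in $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, K^\times)$. Show that the member of $C^2(G, K^\times)$ that
corresponds to $\psi$ is $\Phi_2(\psi \circ f_2)$ and that

$$
\Phi_2(\psi \circ f_2)(\sigma^k, \sigma^l) =
\begin{cases}
\psi(0) & \text{if } k + l < n, \\
\psi(\sigma^{k+l-n})^{-1} & \text{if } k + l \ge n.
\end{cases}
$$

10. Let $y$ be a member of $K^\times$, and let $\psi$ be the unique element of $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, K^\times)$
with $\psi(1) = y$. Why in the context of Proposition 3.32 is $\psi$ a 2-cocycle if and
only if $y$ is in $F^\times$?

11. Take $\psi$ as in the previous problem with $\psi(1) = r^{-1}$, and show that the member of
$C^2(G, K^\times)$ that corresponds to it under Problem 9 is the factor set $a$ of Problem 6.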


12. Deduce from the previous problem that the order of the Brauer equivalence class
of $A$ in $\mathcal{B}(K/F)$ is the order of the coset of $r$ in $F^\times / N_{K/F}(K^\times)$. Why does it follow
that $A$ is a division algebra over $F$ if the coset of $r$ in $F^\times / N_{K/F}(K^\times)$ has exact
order $n$? (Educational note: This result is a theorem of Wedderburn except that
it is here dressed in more modern language. The special case that $n$ is prime
was already handled by Problems 18-19 at the end of Chapter II. Although the
converse was seen in those problems to be valid for $n$ prime, the converse is
known to fail for $n = 4$.)


Problems 13-20 introduce the reduced norm of a central simple algebra and give an
application. Let $A$ be a central simple algebra over a field $F$ with $\dim_F A = n^2$. For
$a$ in $A$, the algebra polynomial of $a$ is defined to be the characteristic polynomial
$\det(X1 - L(a))$ of the $F$ linear mapping $L(a) : A \to A$ given by the left multiplication
$x \mapsto ax$. This monic polynomial lies in $F[X]$ and has degree $n^2$. The ordinary
norm $N_{A/F}(a)$ is defined to be $(-1)^{n^2}$ times the constant term, and the ordinary
trace $\operatorname{Tr}_{A/F}(a)$ is defined to be minus the coefficient of $X^{n^2-1}$; these functions of $a$
take values in $F$. Choose a finite Galois extension $K$ of $F$ that splits $A$, and fix an
isomorphism $\varphi : A \otimes_F K \to M_n(K)$. The reduced polynomial of $a$ is defined to be
the monic polynomial $\det(\varphi(X1 - a \otimes 1))$. This polynomial lies in $K[X]$ and has
degree $n$. The reduced norm $\operatorname{Nrd}_{A/F}(a)$ is defined to be $(-1)^n$ times the constant
term, and the reduced trace $\operatorname{Trrd}_{A/F}(a)$ is defined to be minus the coefficient of
$X^{n-1}$; these functions of $a$ initially take values in $K$.

13. Prove that the reduced polynomial of $a$ does not depend on the choice of the
isomorphism $\varphi$.

14. Prove that $\det(X1 - L(a)) = \det(\varphi(X1 - a \otimes 1))^n$.

15. Using Galois theory and unique factorization, prove that any monic polynomial
$P(X)$ in $K[X]$ such that $P(X)^n$ lies in $F[X]$ already lies in $F[X]$. Conclude that
the reduced polynomial of any element of $A$ is in $F[X]$.

16. Prove that $\det(\varphi(X1 - a \otimes 1))$ does not depend on the choice of the Galois
extension $K$ of $F$ that splits $A$.


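17. Deduce that $\operatorname{Nrd}_{A/F}$ is a function from $A$ to $F$ such that $\operatorname{Nrd}_{A/F}(ab) =
\operatorname{Nrd}_{A/F}(a)\operatorname{Nrd}_{A/F}(b)$ for all $a$ and $b$ in $A$, $\operatorname{Nrd}_{A/F}(1) = 1$, and $\operatorname{Nrd}_{A/F}(a)^n =
N_{A/F}(a)$ for all $a$ in $A$. How does it follow that
(a) an element $a \in A$ is invertible if and only if $\operatorname{Nrd}_{A/F}(a) \ne 0$ and
(b) $A$ is a division algebra if and only if $\operatorname{Nrd}_{A/F}(a) = 0$ only for $a = 0$?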

18. Let $K/F$ be a finite Galois extension of fields, put $G = \operatorname{Gal}(K/F)$, and suppose
that a crossed-product algebra $A = \mathcal{A}(K, G, a)$ is given as in Proposition 3.12
with $K \subseteq A$ and with $\dim_F A = (\dim_F K)^2 = n^2$. Let $\{x_\sigma \mid \sigma \in G\}$ be the
system in the proposition such that $A = \bigoplus_{\sigma \in G} K x_\sigma$. Associate a matrix $m(v)$
in $M_n(K)$ to each $v \in A$ as follows. The rows and columns of the matrices are
indexed by $G$, and $E_{\sigma,\tau}$ denotes the matrix that is 1 in the $(\sigma, \tau)$ entry and is 0
elsewhere. Let $m(c x_\tau) = \sum_\sigma \sigma(c)\, a(\sigma, \tau)\, E_{\sigma,\sigma\tau}$ for $c \in K$, and extend additively to
handle all $v \in A$. Check that $v \mapsto m(v)$ is a one-one $F$ algebra homomorphism of
$A$ into $M_n(K)$, and prove that $\operatorname{Nrd}_{A/F}(v) = \det m(v)$. (Educational note: Thus
by Proposition 3.12 the matrix algebra in Problems 6-12 is central simple.)


19. Identify the norm and the reduced norm for the real algebra H of quaternions. 


20. A field $F$ is said to satisfy condition (C1) if every homogeneous polynomial
of degree $d$ in $n$ variables with $d < n$ has a nontrivial zero. Using the reduced
norm for a central division algebra over $F$, prove that condition (C1) implies
that $\mathcal{B}(F) = 0$. (Educational note: Algebraically closed fields and finite fields
satisfy (C1), the latter by a theorem of Chevalley. A deeper fact is that a simple
transcendental extension of an algebraically closed field satisfies (C1); see the
Problems at the end of Chapter VIII.)


CHAPTER IV 


Homological Algebra 


Abstract. This chapter develops the rudiments of the subject of homological algebra, which is an 
abstraction of various ideas concerning manipulations with homology and cohomology. Sections 
1-7 work in the context of good categories of modules for a ring, and Section 8 extends the discussion 
to abelian categories. 

Section 1 gives a historical overview, defines the good categories and additive functors used in 
most of the chapter, and gives a more detailed outline than appears in this abstract. 

Section 2 introduces some notions that recur throughout the chapter—complexes, chain maps, 
homotopies, induced maps on homology and cohomology, exact sequences, and additive functors. 
Additive functors that are exact or left exact or right exact play a special role in the theory. 

Section 3 contains the first main theorem, saying that a short exact sequence of chain or cochain 
complexes leads to a long exact sequence in homology or cohomology. This theorem sees repeated 
use throughout the chapter. Its proof is based on the Snake Lemma, which associates a connecting 
homomorphism to a certain kind of diagram of modules and maps and which establishes the exactness 
of a certain 6-term sequence of modules and maps. The section concludes with proofs of the crucial 
fact that the Snake Lemma and the first main theorem are functorial. 

Section 4 introduces projectives and injectives and proves the second main theorem, which 
concerns extensions of partial chain and cochain maps and also construction of homotopies for 
them when the complexes in question satisfy appropriate hypotheses concerning exactness and the 
presence of projectives or injectives. The notion of a resolution is defined in this section, and the 
section concludes with a discussion of split exact sequences. 

Section 5 introduces derived functors, which are the basic mathematical tool that takes advantage 
of the theory of homological algebra. Derived functors of all integer orders $\ge 0$ are defined for any
left exact or right exact additive functor when enough projectives or injectives are present, and they 
generalize homology and cohomology functors in topology, group theory, and Lie algebra theory. 

Section 6 implements the two theorems of Section 3 in the situation in which a left exact or right 
exact additive functor is applied to an exact sequence. The result is a long exact sequence of derived 
functor modules. It is proved that the passage from short exact sequences to long exact sequences 
of derived functor modules is functorial. 

Section 7 studies the derived functors of Hom and tensor product in each variable. These are 
called Ext and Tor, and the theorem is that one obtains the same result by using the derived functor 
mechanism in the first variable as by using the derived functor mechanism in the second variable. 

Section 8 discusses the generalization of the preceding sections to abelian categories, which are 
abstract categories satisfying some strong axioms about the structure of morphisms and the presence 
of kernels and cokernels. Some generalization is needed because the theory for good categories is 
insufficient for the theory for sheaves, which is an essential tool in the theory of several complex 
variables and in algebraic geometry. Two-thirds of the section concerns the foundations, which 
involve unfamiliar manipulations that need to be internalized. The remaining one-third introduces an
artificial definition of “member” for each object and shows that familiar manipulations with members 
can be used to verify equality of morphisms, commutativity of square diagrams, and exactness of 
sequences of objects and morphisms. The consequence is that general results for categories of 
modules in homological algebra requiring such verifications can readily be translated into results for 
general abelian categories. The method with members, however, does not provide for constructions 
of morphisms member by member. Thus the construction of the connecting homomorphism in the 
Snake Lemma needs a new proof, and that is given in a concluding example. 


1. Overview 


This chapter develops the rudiments of the subject of homological algebra. The 
only prerequisite within the present volume is the self-contained Section III.5 
entitled “Digression on Cohomology of Groups,” which is helpful primarily as 
motivation. The definitions of category, functor, object, morphism, natural trans- 
formation, product, and coproduct as in Chapters IV and VI of Basic Algebra will 
be taken as known, and it will be helpful as motivation to know also the material 
from Chapter VII of Basic Algebra on group extensions and cohomology of 
groups. The present chapter will make some allusions to notions from algebraic 
topology, particularly in this first section, and the reader is encouraged to skip 
lightly over anything of this kind that might be an impediment to continuing with 
the remainder of the chapter. 

Homology and cohomology have their origins in attempts to assign algebraic 
invariants to topological obstructions. One example historically was the holes 
in a domain of the Euclidean plane that can make line integrals that are locally 
independent of the path fail to be globally independent of the path. Another was 
the handles on 2-dimensional closed surfaces. These obstructions were originally 
viewed as numbers (Betti numbers for example) and later viewed as algebraic 
objects such as abelian groups or vector spaces. A big advance was to regard 
them not just as objects attached to geometric configurations but as functors that 
attach objects to geometric configurations and also attach functions between such 
objects to reflect the behavior of functions between geometric configurations. 

Hints of connections with algebra on a deeper level and hints that homology and 
cohomology could be computed quite flexibly began with work of W. Hurewicz 
in 1936 and H. Hopf in 1942. Hurewicz considered the following situation: M 
is a finite connected simplicial complex, U is its universal cover, and G is the 
fundamental group of M. Suppose that U is contractible. The group G acts freely 
on the group $C_*(U)$ of simplicial chains of $U$ (with integer coefficients). The
boundary operator then gives us an exact sequence

$$0 \longleftarrow \mathbb{Z} \longleftarrow C_0(U) \longleftarrow C_1(U) \longleftarrow C_2(U) \longleftarrow \cdots$$

of abelian groups with an action of $G$ on each $C_j(U)$ by automorphisms in such
a way that each $C_j(U)$ in effect is a free $\mathbb{Z}G$ module. Applying $(\,\cdot\,) \otimes_{\mathbb{Z}G} \mathbb{Z}$, we
obtain the complex

$$0 \longleftarrow C_0(M) \longleftarrow C_1(M) \longleftarrow C_2(M) \longleftarrow \cdots.$$

The homology $H_0(M)$ is just $\mathbb{Z}$ because $M$ is connected, and $H_1(M)$ is just the
quotient of $G$ by its commutator subgroup; thus $H_0(M)$ and $H_1(M)$ depend only
on $G$. What Hurewicz showed is that all higher $H_j(M)$ depend only on $G$; he did
not address existence of such spaces $M$ and $U$ for $G$.

Hopf clarified the situation and drew attention to it by making an explicit 
calculation: Dropping all assumptions on U other than its simple connectivity, 
he gave a formula for the quotient of $H_2(M)$ modulo the subgroup of “spherical
homology classes” in terms of $G$. Later he obtained a result for higher-degree
homology. In effect, Hopf was giving formulas for $H_n(G, \mathbb{Z})$ by discovering and
applying the homology analog of the cohomology result given as Theorem 3.31
in Section III.5.

Meanwhile, S. Eilenberg in 1944 made an adjustment to Lefschetz’s singular 
homology theory and showed for locally finite polyhedra that his adjusted theory 
gives the same groups as the more traditional simplicial theory. His method 
was to introduce a third complex, to exhibit chain maps from this to each of 
the complexes under study, and show that the chain maps possess inverses in a 
suitable sense. 

In addition to the people mentioned above, some others who pursued these mat- 
ters in the mid 1940s were R. Baer, B. Eckmann, H. Freudenthal, and S. Mac Lane. 
One thing that mathematicians gradually realized was that homology and coho- 
mology in various situations can be calculated from suitable kinds of abstract 
resolutions, a fact that lies at the heart of the subject of homological algebra. 
Another was that the subject of cohomology of groups made sense on an abstract 
level without any reference to topology and that the theory of factor sets for group 
extensions, as had been introduced by O. Schreier in the 1920s, was actually one 
aspect of this theory. 

With a great leap of generality, H. Cartan and Eilenberg set down such a theory 
in their celebrated book Homological Algebra, whose publication was delayed 
until 1956. Homology and cohomology became things attached to complexes, 
no longer dependent on topology, and the book developed enormous machinery 
for working with such complexes and homology/cohomology. By the time that 
Cartan and Eilenberg had published their book, other special cases of homological 
algebra had already arisen. One was the cohomology theory of Lie algebras, 
developed by C. Chevalley in the 1940s and by J.-L. Koszul in 1950. Another was 
the cohomology theory of sheaves, used in the subject of several complex variables 
starting about 1950 by K. Oka and H. Cartan; sheaves themselves had been 
introduced in 1946 by J. Leray in connection with partial differential equations. 

In the eventual theory the fundamental notion is that of a “derived functor”: 
homology or cohomology is obtained by starting from some kind of resolution, 
or exact complex, passing to another complex by means of a functor with some 
special properties, and then extracting the homology or cohomology of the image 
complex. Two categories are thus involved, one for the resolution and one for 
the values of the functor. From an expository point of view, it seems wise to start 
with concrete categories and not to try to identify the most general categories for 
which the theory makes sense. For much of the chapter, we shall work with a 
category not much more general than the category $\mathcal{C}_R$ of all unital left R modules, 
where R is a ring with identity, and our functors will pass from one such category 
to another. Use of categories $\mathcal{C}_R$ subsumes the following applications: 


(i) manipulations with basic homology and cohomology in topology, in 
which one begins with the ring R = Z of integers. For more advanced 
applications in topology, one moves from Z to more general rings. 

(ii) homology and cohomology of groups, in which one initially uses group 
rings of the form ZG, where G is any group and Z is the ring of integers. 

(iii) homology and cohomology of Lie algebras. If $\mathfrak{g}$ is a Lie algebra over 
a field such as C, then $\mathfrak{g}$ has a “universal enveloping algebra” $U(\mathfrak{g})$ 
and a canonical mapping $\iota : \mathfrak{g} \to U(\mathfrak{g})$. Here $U(\mathfrak{g})$ is a complex 
associative algebra with identity, $\iota$ is a Lie algebra homomorphism, and 
the pair $(U(\mathfrak{g}), \iota)$ has the following universal mapping property: whenever 
$\varphi : \mathfrak{g} \to A$ is a Lie algebra homomorphism into a complex associative 
algebra A with identity, then there is a unique homomorphism 
$\Phi : U(\mathfrak{g}) \to A$ of associative algebras with identity such that $\varphi = \Phi \circ \iota$. 
Lie algebra homology and cohomology are the theory for the set-up in 
which the initial underlying rings are $U(\mathfrak{g})$ and C. 


In other words, in each of the three applications above, many derived functors of 
importance pass from the category $\mathcal{C}_R$ for a ring R with identity to the category 
$\mathcal{C}_S$ for another ring S with identity. 

The slight generalization of categories $\mathcal{C}_R$ that we shall use for much of the 
chapter is as follows: Let R be a ring with identity. A good category $\mathcal{C}$ of R 
modules consists of 


(i) some nonempty class of unital left R modules closed under passage 
to submodules, quotients, and finite direct sums (the modules of the 
category), 

(ii) the full sets $\operatorname{Hom}_R(A, B)$ of all R linear homomorphisms from A to B 
for each A and B as in (i) (the morphisms, or maps, of the category). 


For example the collection of all finitely generated abelian groups, as a subcategory 
of $\mathcal{C}_{\mathbb{Z}}$, is a good category.¹ So is the collection of all torsion abelian groups, 
i.e., abelian groups whose elements all have finite order, as a subcategory of $\mathcal{C}_{\mathbb{Z}}$. 

¹One reason for working with this slight generalization is to emphasize that a certain property 
of categories $\mathcal{C}_R$, namely that they have “enough projectives” and “enough injectives” in a sense to 
be made precise below in Section 5, does not necessarily persist for slight variants of $\mathcal{C}_R$. 

The definition of “good category” specifies left R modules that are unital. 
However, the theory applies equally well to right R modules that are unital, since 
a unital right R module becomes a unital left module for the opposite ring $R^o$, 
i.e., the ring whose underlying abelian group is the same as for R and whose 
multiplication is given by $a \circ b = ba$. 

The special property of a functor $F : \mathcal{C} \to \mathcal{C}'$ used for passing from a complex 
in one good category to a complex in another good category is that it is additive, 
namely that $F(\varphi_1 + \varphi_2) = F(\varphi_1) + F(\varphi_2)$ whenever $\varphi_1$ and $\varphi_2$ are in the 
same $\operatorname{Hom}_R(A, B)$. The initial examples of additive functors are tensor product 
$M \otimes_R (\cdot)$, which passes from $\mathcal{C}_R$ to $\mathcal{C}_{\mathbb{Z}}$ if M is a right R module, and Hom in 
each variable: $\operatorname{Hom}_R(\cdot, M)$ and $\operatorname{Hom}_R(M, \cdot)$, both of which pass from $\mathcal{C}_R$ to 
$\mathcal{C}_{\mathbb{Z}}$ if M is a left R module. In Section 2 we shall consider additive functors in 
more detail. 

The set-up with good categories does not subsume the cohomology of sheaves, 
nor some other applications of interest, such as the cohomology of vector bundles 
with a fixed base. The cohomology of sheaves is an important tool in algebraic 
geometry and several complex variables, and it cannot be ignored. Consequently 
one ultimately wants the theory to extend to other categories than good categories 
of modules. In addition, it is quite useful to have the theory work for the categories 
opposite to two given categories if it works for two given categories, and this 
feature means that the general theory should not insist that the objects be sets 
of elements and the morphisms be functions on such elements. Accordingly the 
abstract theory is carried out for “abelian categories,” which will be defined in 
Section 8. The idea for creating the abstract theory is to take the theory for good 
categories of modules and rephrase all of the results for all abelian categories. In 
many instances the proofs will translate easily to the general setting, but in other 
instances it will be necessary to eliminate individual elements from arguments 
and obtain new arguments that rely only on complexes, exact sequences, and 
commutative diagrams. Some of this detail will be carried out in Section 8. 

Sections 2-3 establish the framework of homology and cohomology in the 
context of good categories of modules. Section 2 discusses complexes and exact 
sequences at length, and Section 3 shows how a short exact sequence of complexes 
leads to a long exact sequence in homology or cohomology. This is the first main 
result of the theory and finds multiple uses later in the chapter. 

Section 4 contains a discussion of “projectives and injectives” that expands and 
systematizes Theorem 3.31, which concerned the flexible role of resolutions in 
computing the cohomology of groups. Once that flexibility is in place in the more 
general setting of good categories, Sections 5—6 introduce derived functors and 
some of their properties. The main examples of derived functors at this stage are 
functors $\operatorname{Ext}(\cdot, \cdot)$ and $\operatorname{Tor}(\cdot, \cdot)$ obtained from Hom and tensor product; these 
are examined more closely in Section 7. The example given in Section III.5 and 
now being used as motivation requires some subtlety to be regarded as a derived 
functor. That example was the system of functors $H^n(G, \cdot)$ yielding cohomology 
of the group G with coefficients in the module $(\cdot)$; these were obtained in Section 
III.5 by applying the functor $\operatorname{Hom}_{\mathbb{Z}G}(\cdot, M)$ to any free resolution of Z in the 
category $\mathcal{C}_{\mathbb{Z}G}$. It is seen in examples in Section 5 that the effect of using the free 
resolution was to compute $H^n(G, M)$ as $\operatorname{Ext}^n_{\mathbb{Z}G}(\cdot, M)$ when the variable is set 
equal to Z; realizing this result as a derived functor in the M variable requires 
knowing that one gets the same result from $\operatorname{Ext}^n_{\mathbb{Z}G}(\mathbb{Z}, \cdot)$ when its variable is set 
equal to M. This conclusion is part of Theorem 4.31, which is proved in Section 7. 

The first seven sections complete the treatment of the rudiments of homological 
algebra in the setting of good categories. One more central technique beyond that 
of derived functors is the mechanism of spectral sequences, but we shall omit this 
topic to save space.² 

The chapter concludes with some discussion of abelian categories in Section 8. 
The foundations of homological algebra have to be redone completely when 
objects are no longer necessarily sets of elements. After this step, one introduces 
a substitute notion of “member” for elements, establishes its properties, and 
immediately obtains extensions of much of the theory to all abelian categories. A 
supplementary argument is needed whenever the theory for good categories uses 
an element-by-element construction of a homomorphism. 

Sheaves are introduced in the last section of text in Chapter X, and their 
cohomology is mentioned very briefly there. 


2. Complexes and Additive Functors 


Let C be a good category of R modules in the sense of Section 1. A complex in C 
is a finite or infinite sequence of modules and maps in C such that the consecutive 
compositions are all 0. There is no harm in assuming that the indexing for 
the sequence is done by all of Z, since we can always adjoin 0 modules and 0 
maps as necessary to fill out the indexing. The indices may be increasing or 
decreasing, and, as we saw in Section III.5, this distinction is only a formality. 
However, the distinction is very convenient when it comes to applications, since 
homology is normally associated with decreasing indices and cohomology is 
normally associated with increasing indices. 

²For the reader who is interested in learning about spectral sequences, this author is partial to the 
explanation of the topic in Appendix D of the book by Knapp and Vogan in the Selected References. 
The setting in that appendix is limited to good categories of modules, and some important applications 
are included. 

Thus let us be more precise about the indexing. A chain complex in $\mathcal{C}$ is 
a sequence of pairs $X = \{(X_n, \partial_n)\}_{n=-\infty}^{\infty}$ in which each $X_n$ is a module in $\mathcal{C}$, 
each $\partial_n$ is a map in $\operatorname{Hom}_R(X_{n+1}, X_n)$, and $\partial_n\partial_{n+1} = 0$ for all n. The maps 
$\partial_n$ are sometimes called boundary maps, or boundary operators. We define 
the homology of X, written $H_*(X) = \{H_n(X)\}_{n=-\infty}^{\infty}$ with subscripts, to be the 
sequence of modules in $\mathcal{C}$ given by 

\[ H_n(X) = (\ker \partial_{n-1})/(\operatorname{image} \partial_n). \]

The members of the space $\ker \partial_{n-1}$ are called n-cycles, and the members of the 
space $\operatorname{image} \partial_n$ are called n-boundaries. 


EXAMPLES OF CHAIN COMPLEXES. 


(1) Simplicial homology. Let S be a simplicial complex of dimension N, and 
number its vertices. For each integer n, the group $C_n(S)$ of simplicial n-chains 
is the free abelian group on the set of simplices of dimension n. This is 0 for 
n < 0 and n > N. In elementary topology one defines the boundary of each 
n-simplex to be the member of $C_{n-1}(S)$ equal to an integer combination of its 
faces, the coefficient of the face being $(-1)^i$ if the missing vertex for the face is 
the i-th of the n + 1 vertices of the given n-simplex. This definition is extended 
additively to the boundary map $\partial_{n-1} : C_n(S) \to C_{n-1}(S)$, and a combinatorial 
argument gives $\partial_{n-1}\partial_n = 0$ for all n. Thus $X = \{(C_n(S), \partial_{n-1})\}$ is a complex. 
The associated homology $H_n(X)$ is the n-th (integral simplicial) homology of the 
simplicial complex S and is usually denoted by $H_n(S)$. 


(2) Cubical singular homology. Let S be a topological space. For $n \ge 0$, a 
singular n-cube in S is a continuous function $T : I^n \to S$, where $I^n$ denotes the 
n-fold product of the closed interval [0, 1] with itself. The free abelian group on 
the set of n-cubes is denoted by $Q_n(S)$. A singular n-cube T is degenerate if 
its values are independent of one of the n variables. The subgroup of $Q_n(S)$ 
generated by the degenerate singular n-cubes is denoted by $D_n(S)$, and the 
quotient $C_n(S) = Q_n(S)/D_n(S)$ is the group of cubical singular n-chains. 
One defines a boundary operator from $Q_n(S)$ to $Q_{n-1}(S)$ for each n in analogy 
with the definition in the previous example and shows that it carries $D_n(S)$ into 
$D_{n-1}(S)$. Consequently the boundary operator descends to a homomorphism of 
abelian groups $\partial_{n-1} : C_n(S) \to C_{n-1}(S)$. A combinatorial argument shows that 
$\partial_{n-1}\partial_n = 0$; thus we get a complex. The associated homology is the n-th (integral 
singular) homology of S and is usually denoted by $H_n(S)$. 

(3) Free resolution of Z in $\mathcal{C}_{\mathbb{Z}G}$. Let G be a group. Then the standard resolution 
of Z in the category $\mathcal{C}_{\mathbb{Z}G}$, as given in Theorem 3.20, is a chain complex in that 
category. 
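
As an illustration of Example (1), the following minimal sketch builds the simplicial boundary matrices for two small complexes and reads off the ranks of their homology groups. It is not taken from the text: the function names, the encoding of a simplex as a sorted tuple of vertex numbers (with positions counted from 0), and the choice to work over the rationals are all assumptions made for the example. Working over Q detects only the ranks of the groups $H_n(S)$ and ignores any torsion.

```python
from fractions import Fraction


def boundary_matrix(simplices, faces):
    """Matrix of the boundary map from chains on `simplices` to chains on `faces`.

    Each simplex is a sorted tuple of vertex numbers; dropping the vertex in
    position i (counted from 0) contributes the face with coefficient (-1)**i.
    """
    row_of = {face: r for r, face in enumerate(faces)}
    matrix = [[0] * len(simplices) for _ in faces]
    for col, simplex in enumerate(simplices):
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]
            matrix[row_of[face]][col] = (-1) ** i
    return matrix


def rank(matrix):
    """Rank over Q by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def betti_numbers(simplices_by_dim):
    """Ranks of H_0, ..., H_N as (rank of cycles) - (rank of boundaries)."""
    top = len(simplices_by_dim) - 1
    # ranks[n] is the rank of the boundary map C_n -> C_{n-1}; it is 0 at both ends.
    ranks = [0] + [
        rank(boundary_matrix(simplices_by_dim[n], simplices_by_dim[n - 1]))
        for n in range(1, top + 1)
    ] + [0]
    return [len(simplices_by_dim[n]) - ranks[n] - ranks[n + 1] for n in range(top + 1)]


vertices = [(0,), (1,), (2,)]
edges = [(0, 1), (0, 2), (1, 2)]
hollow_triangle = [vertices, edges]               # a circle, up to homeomorphism
filled_triangle = [vertices, edges, [(0, 1, 2)]]  # a disk

print(betti_numbers(hollow_triangle))  # [1, 1]
print(betti_numbers(filled_triangle))  # [1, 0, 0]
```

The hollow triangle has one 1-cycle that is not a boundary, while filling in the 2-simplex makes that cycle a boundary and kills the first homology.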


Let us make the class of chain complexes for the good category C into a category. 
Each chain complex is to be an object. If $X = \{(X_n, \partial_n)\}$ and $X' = \{(X'_n, \partial'_n)\}$ 
are two chain complexes in $\mathcal{C}$, a morphism in Morph(X, X') is any chain map 
$f = \{f_n\}$, defined as a sequence of maps $f_n \in \operatorname{Hom}_R(X_n, X'_n)$ such that the 
diagram 

\[
\begin{array}{ccc}
X_n & \xrightarrow{\ \partial_{n-1}\ } & X_{n-1} \\
{\scriptstyle f_n}\downarrow & & \downarrow{\scriptstyle f_{n-1}} \\
X'_n & \xrightarrow{\ \partial'_{n-1}\ } & X'_{n-1}
\end{array}
\]

commutes for all n. Briefly $f\partial = \partial' f$. Since the $f_n$'s are functions, it is 
customary to use function notation $f : X \to X'$ for chain maps. The system $\{1_{X_n}\}$ 
of identity maps serves as an identity morphism, and coordinate-by-coordinate 
composition is associative. Thus the result is a category. 

The next step is to observe that homology $H_*$, as applied to chain maps for 
the category $\mathcal{C}$, is a covariant functor from the category of chain maps to itself. 
The effect of the functor on objects is to send X to $H_*(X) = \{(H_n(X), 0)\}$. If 
$f : X \to X'$ is a chain map, then the formula $\partial'_{n-1}(f_n(x_n)) = f_{n-1}(\partial_{n-1}(x_n))$ 
shows that $f_n(\ker \partial_{n-1}) \subseteq \ker \partial'_{n-1}$, and the formula $\partial'_n(f_{n+1}(x_{n+1})) = 
f_n(\partial_n(x_{n+1}))$ shows that $f_n(\operatorname{image} \partial_n) \subseteq \operatorname{image} \partial'_n$. Therefore $f_n$ descends 
to the quotient, giving a map $H(f_n) : H_n(X) \to H_n(X')$. The assembled 
collection of maps $H_*(f) : H_*(X) \to H_*(X')$ is manifestly a chain map. Instead 
of writing $H(f_n)$ for the map induced by $f_n$ on the n-th homology, we shall often 
write $(f_n)_*$ or $\bar{f}_n$, especially in diagrams, to make the notation less cumbersome. 
Since the identity chain map yields the identity on $H_*(X)$ and since compositions 
go to compositions in the same order, homology $H_*$ is a covariant functor. 

If $f : X \to X'$ and $g : X \to X'$ are two chain maps, then a homotopy h 
of f to g is a system of maps $h = \{h_n\}$ increasing degrees by 1, i.e., having 
$h_n$ carry $X_n$ into $X'_{n+1}$, such that $h_{n-1}\partial_{n-1} + \partial'_n h_n = f_n - g_n$ for all n. Briefly 
$h\partial + \partial' h = f - g$. When such an h exists, we say that f and g are homotopic, 
and we write $f \simeq g$. This relation is an equivalence relation. 


Proposition 4.1. If $f : X \to X'$ and $g : X \to X'$ are homotopic chain maps 
in the good category $\mathcal{C}$, then f and g induce the same maps $H_*(f)$ and $H_*(g)$ 
on homology, i.e., $H_n(f)$ and $H_n(g)$ are the same map of $H_n(X)$ into $H_n(X')$ for 
each n. 


PROOF. Let h be a homotopy, and suppose that $\partial_{n-1}(x_n) = 0$. Then the 
computation $f_n(x_n) - g_n(x_n) = h_{n-1}\partial_{n-1}(x_n) + \partial'_n h_n(x_n) = 0 + \partial'_n h_n(x_n)$ shows 
that the images of $x_n$ under $f_n$ and $g_n$ in $X'_n$ differ by a member of $\operatorname{image} \partial'_n$. 
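
A small worked instance may help fix the indexing in the homotopy identity. The complex below is an illustration chosen here, not an example from the text: it has Z in degrees 0 and 1, the identity as its only nonzero boundary map, and 0 elsewhere. The identity chain map turns out to be homotopic to the zero chain map, so Proposition 4.1 forces all homology to vanish, in agreement with a direct computation.

```latex
\[
X:\qquad \cdots \longleftarrow 0 \longleftarrow X_0 = \mathbb{Z}
  \xleftarrow{\ \partial_0 = 1\ } X_1 = \mathbb{Z} \longleftarrow 0 \longleftarrow \cdots
\]
Take $f = 1_X$, $g = 0$, $h_0 = 1 : X_0 \to X_1$, and $h_n = 0$ for $n \neq 0$. Then
\begin{align*}
n = 1:&\quad h_0\partial_0 + \partial_1 h_1 = 1 + 0 = f_1 - g_1, \\
n = 0:&\quad h_{-1}\partial_{-1} + \partial_0 h_0 = 0 + 1 = f_0 - g_0,
\end{align*}
and both sides are $0$ in all other degrees. Hence $1_X \simeq 0$, and
$H_n(1_X) = H_n(0)$ forces $H_n(X) = 0$ for all $n$. Directly:
\[
H_1(X) = \ker \partial_0 / \operatorname{image} \partial_1 = 0,
\qquad
H_0(X) = \ker \partial_{-1} / \operatorname{image} \partial_0 = \mathbb{Z}/\mathbb{Z} = 0.
\]
```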


Briefly let us translate all of these definitions and conclusions into statements 
when the complexes have increasing indices. A cochain complex in $\mathcal{C}$ is a 
sequence of pairs $X = \{(X_n, d_n)\}_{n=-\infty}^{\infty}$ in which each $X_n$ is a module in $\mathcal{C}$, 
each $d_n$ is a map in $\operatorname{Hom}_R(X_n, X_{n+1})$, and $d_{n+1}d_n = 0$ for all n. The maps $d_n$ 
are sometimes called coboundary maps, or coboundary operators. We define 
the cohomology of X, written $H^*(X) = \{H^n(X)\}_{n=-\infty}^{\infty}$ with superscripts, to 
be the sequence of modules in $\mathcal{C}$ given by $H^n(X) = (\ker d_n)/(\operatorname{image} d_{n-1})$. The 
members of the space $\ker d_n$ are called n-cocycles, and the members of $\operatorname{image} d_{n-1}$ 
are called n-coboundaries. 


EXAMPLES OF COCHAIN COMPLEXES. 


(1) Singular cohomology. Let S be a topological space, let $X = \{(C_n(S), \partial_{n-1})\}$ 
be its complex of cubical singular n-chains, and let M be any abelian group. If 
$C^n(S, M) = \operatorname{Hom}_{\mathbb{Z}}(C_n(S), M)$ and if $d_n : C^n(S, M) \to C^{n+1}(S, M)$ is the 
map $d_n = \operatorname{Hom}(\partial_n, 1)$, then $Y = \{(C^n(S, M), d_n)\}$ is a cochain complex, 
and its cohomology, written $H^*(Y) = \{H^n(S, M)\}$, is the (integral singular) 
cohomology of S with coefficients in M. 


(2) Cohomology of groups. Let G be a group, and let M be an abelian group 
on which G acts by automorphisms. Let $C^n(G, M)$ be the abelian group of 
functions from the n-fold product of G with itself into M, the functions being 
added pointwise. Define $\delta_n : C^n(G, M) \to C^{n+1}(G, M)$ as in Section III.5. 
Then $X = \{(C^n(G, M), \delta_n)\}$ is a cochain complex, and its cohomology $H^*(X) = 
\{H^n(G, M)\}$ is the cohomology of G with coefficients in M. 


The cochain complexes for the good category $\mathcal{C}$ form a category for which the 
morphisms from $X = \{(X_n, d_n)\}$ to $X' = \{(X'_n, d'_n)\}$ are cochain maps $f = \{f_n\}$; 
the latter are defined by the conditions that $f_n$ carry $X_n$ to $X'_n$ and $fd = d'f$, i.e., 
$f_{n+1}d_n = d'_n f_n$ for all n. Cohomology $H^*$, as applied to cochain maps for the 
category $\mathcal{C}$, is a covariant functor from the category of cochain maps to itself. 
The effect of the functor on objects is to send X to $H^*(X) = \{(H^n(X), 0)\}$, and 
the argument that a cochain map $f : X \to X'$ carries $H^*(X)$ to $H^*(X')$ via a 
cochain map $H^*(f)$ is the same as for chain maps. Instead of writing $H(f_n)$ for 
the map induced by $f_n$ on the n-th cohomology, we shall often write $(f_n)^*$ or $\bar{f}_n$, 
especially in diagrams, to make the notation less cumbersome.³ 

If $f : X \to X'$ and $g : X \to X'$ are two cochain maps, then a homotopy 
h of f to g is a system of maps $h = \{h_n\}$ decreasing degrees by 1, i.e., having 
$h_n$ carry $X_n$ into $X'_{n-1}$, such that $h_{n+1}d_n + d'_{n-1}h_n = f_n - g_n$ for all n. Briefly 
$hd + d'h = f - g$. When such an h exists, we say that f and g are homotopic, 
and we write $f \simeq g$. This relation is an equivalence relation. 


³The notation with the bar is to be avoided when there might be some ambiguity about which of 
homology and cohomology is involved. 


Proposition 4.1'. If $f : X \to X'$ and $g : X \to X'$ are homotopic cochain 
maps in the good category $\mathcal{C}$, then f and g induce the same maps $H^*(f)$ and 
$H^*(g)$ on cohomology, i.e., $H^n(f)$ and $H^n(g)$ are the same map of $H^n(X)$ into 
$H^n(X')$ for each n. 


PROOF. Let h be a homotopy, and suppose that $d_n(x_n) = 0$. Then the 
computation $f_n(x_n) - g_n(x_n) = h_{n+1}d_n(x_n) + d'_{n-1}h_n(x_n) = 0 + d'_{n-1}h_n(x_n)$ shows 
that the images of $x_n$ under $f_n$ and $g_n$ in $X'_n$ differ by a member of $\operatorname{image} d'_{n-1}$. 


A chain or cochain complex written neutrally as $X = \{X(n)\}$ is exact at X(n) 
if the kernel of the outgoing map at X(n) equals the image of the incoming map 
at X(n) (as opposed to merely containing the image). The complex is exact, or 
is an exact sequence, if it is exact at every X(n). A short exact sequence is an 
exact sequence of the form 

\[ 0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0, \]

understood to have 0's at all positions beyond each end. The conditions on the 
5-term complex above for it to be exact are that $\varphi$ be one-one, $\psi$ be onto C, and 
that $\psi$ exhibit C as isomorphic to $B/\operatorname{image} \varphi$. To make the terminology more 
symmetric, it is customary to introduce a name for the quotient of the range of a 
homomorphism $\tau$ by the image of $\tau$; this quotient is defined to be the cokernel 
of the homomorphism and is denoted by $\operatorname{coker} \tau$. The conditions for exactness 
above can then be restated more symmetrically as 

\[ \ker \varphi = \operatorname{coker} \psi = 0 \quad\text{and}\quad \operatorname{image} \varphi = \ker \psi. \]
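
For a concrete check of these three conditions, one standard short exact sequence of abelian groups can be worked out as follows. The particular sequence is chosen here for illustration and is not an example from the text.

```latex
\[
0 \longrightarrow \mathbb{Z} \xrightarrow{\ \varphi\ } \mathbb{Z}
  \xrightarrow{\ \psi\ } \mathbb{Z}/2\mathbb{Z} \longrightarrow 0,
\qquad \varphi(k) = 2k, \quad \psi(k) = k + 2\mathbb{Z}.
\]
Here
\begin{align*}
\ker \varphi &= \{k : 2k = 0\} = 0, \\
\operatorname{coker} \psi &= (\mathbb{Z}/2\mathbb{Z}) / \operatorname{image} \psi = 0, \\
\operatorname{image} \varphi &= 2\mathbb{Z} = \ker \psi,
\end{align*}
and $\psi$ exhibits $\mathbb{Z}/2\mathbb{Z}$ as isomorphic to
$\mathbb{Z}/\operatorname{image} \varphi = \mathbb{Z}/2\mathbb{Z}$.
```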


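An exact sequence can always be broken into short exact sequences by stretching 
each link 

\[ \cdots \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } \cdots \]

into 

\[ \cdots \longrightarrow A \xrightarrow{\ \varphi\ } \operatorname{image} \varphi \longrightarrow 0 \longrightarrow 0 \longrightarrow \ker \psi \xrightarrow{\ \mathrm{inc}\ } B \xrightarrow{\ \psi\ } \cdots \]

and breaking it between the 0's; here “inc” denotes the inclusion mapping of 
$\ker \psi$ into B. This stretching process does not take us outside our good category, 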
since good categories are assumed to be closed under passage to submodules and 
quotients. Conversely if we have two exact sequences 


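\[ \cdots \longrightarrow A \xrightarrow{\ \varphi\ } K \longrightarrow 0 \qquad\text{and}\qquad 0 \longrightarrow K \xrightarrow{\ i\ } B \xrightarrow{\ \psi\ } \cdots, \]

then we can combine them into an exact sequence 

\[ \cdots \longrightarrow A \xrightarrow{\ i\varphi\ } B \xrightarrow{\ \psi\ } \cdots. \]

Exactness at A of the merged sequence follows because $\ker(i\varphi) = \ker \varphi$, and 
exactness at B follows because $\ker \psi = \operatorname{image} i = \operatorname{image}(i\varphi)$. 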

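Any map $\varphi : A \to B$ in our good category can be expressed in terms of an 
exact sequence by including the kernel and cokernel: 

\[ 0 \longrightarrow \ker \varphi \xrightarrow{\ i\ } A \xrightarrow{\ \varphi\ } B \xrightarrow{\ q\ } \operatorname{coker} \varphi \longrightarrow 0; \]

here $i : \ker \varphi \to A$ is the inclusion, and $q : B \to \operatorname{coker} \varphi$ is the quotient mapping. 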
All the modules and maps in the exact sequence are in the category, since good 
categories are assumed to be closed under passage to submodules and quotients. 
We shall use the following special case of this observation in Section 3. 

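Proposition 4.2. Let $X = \{(X_n, \partial_n)\}_{n=-\infty}^{\infty}$ be a chain complex in a good 
category with $\partial_n$ in $\operatorname{Hom}_R(X_{n+1}, X_n)$ for each n. Then the boundary operator 
$\partial_{n-1}$ on $X_n$ descends to the quotient as a mapping $\bar{\partial}_{n-1} : \operatorname{coker} \partial_n \to \ker \partial_{n-2}$ 
and yields an exact sequence 

\[ 0 \longrightarrow H_n(X) \xrightarrow{\ i\ } \operatorname{coker} \partial_n \xrightarrow{\ \bar{\partial}_{n-1}\ } \ker \partial_{n-2} \xrightarrow{\ q\ } H_{n-1}(X) \longrightarrow 0. \]

Here i is the inclusion $i : \ker \partial_{n-1}/\operatorname{image} \partial_n \to X_n/\operatorname{image} \partial_n$, and q is the quotient 
$q : \ker \partial_{n-2} \to \ker \partial_{n-2}/\operatorname{image} \partial_{n-1}$. This association of a six-term exact 
sequence to X for each n is functorial in the sense that if $X' = \{(X'_n, \partial'_n)\}_{n=-\infty}^{\infty}$ is 
a second chain complex and if $f : X \to X'$ is a chain map, then the diagram 

\[
\begin{array}{ccccccc}
H_n(X) & \xrightarrow{\ i\ } & \operatorname{coker} \partial_n & \xrightarrow{\ \bar{\partial}_{n-1}\ } & \ker \partial_{n-2} & \xrightarrow{\ q\ } & H_{n-1}(X) \\
\downarrow & & \downarrow & & \downarrow & & \downarrow \\
H_n(X') & \xrightarrow{\ i'\ } & \operatorname{coker} \partial'_n & \xrightarrow{\ \bar{\partial}'_{n-1}\ } & \ker \partial'_{n-2} & \xrightarrow{\ q'\ } & H_{n-1}(X')
\end{array}
\]

commutes; here the vertical maps are those induced by $f_n$ and $f_{n-1}$. 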


REMARKS. 

(1) The term “functorial” in the statement has a precise meaning in this and 
other contexts. Each chain complex is being carried to a 6-term exact sequence 
for each n. The chain complexes and the 6-term exact sequences both form 
categories, the morphisms in each case being chain maps. To say that the passage 
from the objects of one category to the other is functorial is to say that the 
passage between the categories is actually a functor, i.e., chain maps for the chain 
complexes are sent to chain maps for the 6-term exact sequences, the identity goes 
to the identity, and compositions go to compositions. The latter two conditions 
are evident, and what needs proof is that chain maps are carried to chain maps.⁴ 


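(2) For a cochain complex $X = \{(X_n, d_n)\}_{n=-\infty}^{\infty}$ with $d_n$ in $\operatorname{Hom}_R(X_n, X_{n+1})$, 
the corresponding exact sequence is 

\[ 0 \longrightarrow H^{n-1}(X) \longrightarrow \operatorname{coker} d_{n-2} \xrightarrow{\ \bar{d}_{n-1}\ } \ker d_n \xrightarrow{\ q\ } H^n(X) \longrightarrow 0, \]

and it is functorial with respect to cochain maps. 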


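PROOF. To see that the map $\bar{\partial}_{n-1}$ carries $\operatorname{coker} \partial_n$ to $\ker \partial_{n-2}$, we write it as a 
composition 

\[ \operatorname{coker} \partial_n = X_n/\operatorname{image} \partial_n \longrightarrow X_n/\ker \partial_{n-1} \cong \operatorname{image} \partial_{n-1} \subseteq \ker \partial_{n-2}, \]

with the arrow induced by the inclusion $\operatorname{image} \partial_n \subseteq \ker \partial_{n-1}$ and with the 
isomorphism induced by applying $\partial_{n-1}$ to $X_n$ and passing to the quotient. Then we 
have $\ker \bar{\partial}_{n-1} = \ker \partial_{n-1}/\operatorname{image} \partial_n = H_n(X)$ and 

\begin{align*}
\operatorname{coker} \bar{\partial}_{n-1} &= \ker \partial_{n-2} / \bar{\partial}_{n-1}(X_n/\operatorname{image} \partial_n) = \ker \partial_{n-2}/\partial_{n-1}X_n \\
&= \ker \partial_{n-2}/\operatorname{image} \partial_{n-1} = H_{n-1}(X),
\end{align*}

and the exactness of the sequence is a special case of the exactness noted in the 
paragraph before the proposition. 

For the assertion that the association is functorial, the left square commutes 
because the verticals are both induced by the same map $f_n$, and the right square 
commutes because the verticals are both induced by the same map $f_{n-1}$. For the 
middle square the commutativity follows from the fact that $f_{n-1}\partial_{n-1} = \partial'_{n-1}f_n$. 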


⁴Some authors use the word “natural” instead of the word “functorial” in this situation. Authors 
who do this may have the notion of “natural transformation” between two functors in mind, or they 
may not. For those who do not, it seems advisable to use a different term like “functorial” to avoid 
confusion. For those who do, the allusion to a natural transformation is at best tortured in this 
instance. A natural transformation refers to two categories C and C’, and the most intuitive choice 
for C here is the category of chain complexes X. There are to be two functors from C to C’ and the 
natural transformation relates the values of those functors on X, for each X; no second complex X’ 
enters into matters. To have X’ involved in a natural transformation would mean including at least 
two chain complexes in each object of C. In other instances, however, some additional structure 
may be present. Then the distinction between “functorial” and “natural” may be one of emphasis 
concerning the data. The statements of Propositions 4.29 and 4.30 below provide examples. 




As was mentioned in Section 1, our interest will be in functors $F : \mathcal{C} \to \mathcal{C}'$ 
between two good categories, not necessarily involving the same ring, with the 
property of being additive. This means that $F(\varphi_1 + \varphi_2) = F(\varphi_1) + F(\varphi_2)$ when 
$\varphi_1$ and $\varphi_2$ are in the same $\operatorname{Hom}_R(A, B)$. 

An additive functor sends any 0 map to the corresponding 0 map. Consequently 
it always sends complexes to complexes. Moreover, since any functor carries the 
identity map of each $\operatorname{Hom}_R(A, A)$ to an identity map, an additive functor has to 
send any module A for which the 0 map and the identity coincide to another such 
module. The 0 module is the unique module A with this property, and thus an 
additive functor has to send the 0 module to a 0 module. 

Moreover, additive functors carry finite direct sums to finite direct sums. (Re- 
call that good categories are closed under finite direct sums.) This fact needs 
proper formulation, and we need first to express direct sums in terms of modules 
and maps. From the point of view of category theory, we shall take advantage 
of the fact that for left R modules, product and coproduct coincide and are given 
by direct sum. If $C = A \oplus B$, then there are thus projections $p_A : C \to A$ and 
$p_B : C \to B$ and injections $i_A : A \to C$ and $i_B : B \to C$ such that 

\begin{gather*}
p_A i_A = 1_A \quad\text{and}\quad p_B i_B = 1_B, \\
p_B i_A = 0 \quad\text{and}\quad p_A i_B = 0,
\end{gather*}

and 

\[ i_A p_A + i_B p_B = 1_C. \]

Conversely if we have maps $p_A$, $i_A$, $p_B$, and $i_B$ with these properties, then the 
modules $A = \operatorname{image} p_A$ and $B = \operatorname{image} p_B$ have the property that C is the internal 
direct sum $C = i_A A \oplus i_B B$, and $i_A$ and $i_B$ are one-one. In fact, the equation 
$i_A p_A + i_B p_B = 1_C$ shows that $i_A A + i_B B = C$. To see that $i_A A \cap i_B B = 0$, 
let x be in the intersection. Then $p_B x$ lies in $p_B i_A A$, which is 0, and $p_A x$ lies in 
$p_A i_B B$, which is 0. Thus $i_A p_A + i_B p_B = 1_C$ gives $0 = i_A p_A x + i_B p_B x = x$. 
Hence $i_A A \cap i_B B = 0$ and $C = i_A A \oplus i_B B$. Finally the equations $p_A i_A = 1_A$ 
and $p_B i_B = 1_B$ imply that $i_A$ and $i_B$ are one-one. 

With direct sum now expressed in terms of modules and maps, let us return to 
the effect of additive functors on direct sums. Let $C = A \oplus B$, and let $p_A$, $p_B$, $i_A$, 
and $i_B$ be as above. Suppose that the additive functor F is covariant. Applying F 
to the displayed identities in the previous paragraph and using that F is additive, 
we see that $F(p_A)$, $F(p_B)$, $F(i_A)$, and $F(i_B)$ have the properties that allow us to 
recognize a direct sum. Hence 

\[ F(C) = F(i_A)F(A) \oplus F(i_B)F(B) \]

with $F(i_A)$ and $F(i_B)$ one-one. Thus 

\[ F(C) \cong F(A) \oplus F(B). \]

If instead F is contravariant, then the roles of the projections and the injections 
get interchanged, but we still obtain $F(C) \cong F(A) \oplus F(B)$. 
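
The five identities that characterize a direct sum are easy to test numerically on free abelian groups of finite rank, where maps are integer matrices. The sketch below is an illustration under stated assumptions rather than anything from the text: the helper names are hypothetical, the decomposition of $\mathbb{Z}^2$ is chosen by hand, and transposition of matrices is used as a stand-in for the contravariant additive functor $\operatorname{Hom}_{\mathbb{Z}}(\cdot, \mathbb{Z})$ relative to standard bases.

```python
def matmul(x, y):
    """Product of two integer matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def add(x, y):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]


def identity(n):
    return [[int(r == c) for c in range(n)] for r in range(n)]


def zero(m, n):
    return [[0] * n for _ in range(m)]


def transpose(x):
    return [list(col) for col in zip(*x)]


def is_direct_sum_data(p_a, i_a, p_b, i_b):
    """Check the five displayed identities for projections p and injections i."""
    dim_a, dim_b, dim_c = len(p_a), len(p_b), len(i_a)
    return (
        matmul(p_a, i_a) == identity(dim_a)
        and matmul(p_b, i_b) == identity(dim_b)
        and matmul(p_b, i_a) == zero(dim_b, dim_a)
        and matmul(p_a, i_b) == zero(dim_a, dim_b)
        and add(matmul(i_a, p_a), matmul(i_b, p_b)) == identity(dim_c)
    )


# Z^2 as the internal direct sum of Z(1,1) and Z(0,1); a hand-picked example.
i_A = [[1], [1]]   # injection of A = Z, sending 1 to (1, 1)
i_B = [[0], [1]]   # injection of B = Z, sending 1 to (0, 1)
p_A = [[1, 0]]     # projection (x, y) -> x
p_B = [[-1, 1]]    # projection (x, y) -> y - x

print(is_direct_sum_data(p_A, i_A, p_B, i_B))  # True

# Contravariant case: transposes of injections act as projections and vice versa.
print(is_direct_sum_data(transpose(i_A), transpose(p_A),
                         transpose(i_B), transpose(p_B)))  # True
```

The second call shows the interchange of roles in the contravariant case: the transposed injections serve as the new projections, and the identities still hold.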

An additive functor $F : \mathcal{C} \to \mathcal{C}'$ between two good categories is exact if it 
transforms exact sequences into exact sequences. Proposition 4.3 below will show 
that exact covariant functors preserve kernels, images, cokernels, submodules, 
quotients, and more. However, exact functors occur only infrequently; we shall 
see a few examples of them in Section 4. For examples of failures at exactness, 
it was shown in Section X.6 of Basic Algebra that if 


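\[ 0 \longrightarrow M \xrightarrow{\ \varphi\ } N \xrightarrow{\ \psi\ } P \longrightarrow 0 \]

is a short exact sequence in the category $\mathcal{C}_R$, if E is a unital left R module, and if 
E' is a unital right R module, then the following sequences in $\mathcal{C}_{\mathbb{Z}}$ are exact: 

\begin{gather*}
E' \otimes_R M \xrightarrow{\ 1 \otimes \varphi\ } E' \otimes_R N \xrightarrow{\ 1 \otimes \psi\ } E' \otimes_R P \longrightarrow 0, \\
0 \longrightarrow \operatorname{Hom}_R(E, M) \xrightarrow{\ \operatorname{Hom}(1,\varphi)\ } \operatorname{Hom}_R(E, N) \xrightarrow{\ \operatorname{Hom}(1,\psi)\ } \operatorname{Hom}_R(E, P), \\
\operatorname{Hom}_R(M, E) \xleftarrow{\ \operatorname{Hom}(\varphi,1)\ } \operatorname{Hom}_R(N, E) \xleftarrow{\ \operatorname{Hom}(\psi,1)\ } \operatorname{Hom}_R(P, E) \longleftarrow 0;
\end{gather*}

on the other hand, the extensions of these complexes to 5-term complexes by the 
adjoining of a 0 need not be exact, and thus the functors $E' \otimes_R (\cdot)$, $\operatorname{Hom}_R(E, \cdot)$, 
and $\operatorname{Hom}_R(\cdot, E)$ are not exact for suitable choices of R, E, and E'. 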
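
One standard choice that exhibits all three failures at once is R = Z with E = E' = Z/2Z, applied to the short exact sequence given by multiplication by 2 on Z. The computation below is offered as an illustration and is not the example worked in Basic Algebra.

```latex
Start from
\[
0 \longrightarrow \mathbb{Z} \xrightarrow{\ \varphi = 2\ } \mathbb{Z}
  \xrightarrow{\ \psi\ } \mathbb{Z}/2\mathbb{Z} \longrightarrow 0 .
\]
Using $\mathbb{Z}/2\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z} \cong \mathbb{Z}/2\mathbb{Z}$,
$\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/2\mathbb{Z}, \mathbb{Z}) = 0$, and
$\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}, \mathbb{Z}/2\mathbb{Z}) \cong \mathbb{Z}/2\mathbb{Z}$,
the three sequences become
\begin{gather*}
\mathbb{Z}/2\mathbb{Z} \xrightarrow{\ 0\ } \mathbb{Z}/2\mathbb{Z}
  \xrightarrow{\ \cong\ } \mathbb{Z}/2\mathbb{Z} \longrightarrow 0, \\
0 \longrightarrow 0 \longrightarrow 0 \longrightarrow \mathbb{Z}/2\mathbb{Z}, \\
\mathbb{Z}/2\mathbb{Z} \xleftarrow{\ 0\ } \mathbb{Z}/2\mathbb{Z}
  \xleftarrow{\ \cong\ } \mathbb{Z}/2\mathbb{Z} \longleftarrow 0 .
\end{gather*}
Each is exact as written, but $1 \otimes \varphi = 0$ is not one-one,
$\operatorname{Hom}(1, \psi) : 0 \to \mathbb{Z}/2\mathbb{Z}$ is not onto, and
$\operatorname{Hom}(\varphi, 1) = 0$ is not onto, so none of the three
extends to an exact 5-term complex.
```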


Proposition 4.3. An additive functor $F : \mathcal{C} \to \mathcal{C}'$ between two good categories 
is exact if and only if it carries all short exact sequences into short exact sequences. 


REMARK. This proposition makes it a little easier to test concrete additive 
functors for exactness than it would be from the definition. 


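PROOF. Necessity is obvious. For sufficiency, let 

\[ A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \]

be exact, and let the additive functor F be covariant, the contravariant case being 
completely analogous. Put $A_1 = \ker \varphi$, $B_1 = \ker \psi$, and $C_1 = \operatorname{image} \psi$. Since 
$\psi\varphi = 0$, we can factor $\varphi$ as $\varphi = \varphi_2\varphi_1$, where $\varphi_1 : A \to B_1$ is $\varphi$ with its range 
space reduced and where $\varphi_2 : B_1 \to B$ is the inclusion. Similarly we can factor 
$\psi$ as $\psi = \psi_2\psi_1$, where $\psi_1 : B \to C_1$ is $\psi$ with its range space reduced and 
where $\psi_2 : C_1 \to C$ is the inclusion. Of the sequences 

\begin{gather*}
0 \longrightarrow A_1 \longrightarrow A \xrightarrow{\ \varphi_1\ } B_1 \longrightarrow 0, \\
0 \longrightarrow B_1 \xrightarrow{\ \varphi_2\ } B \xrightarrow{\ \psi_1\ } C_1 \longrightarrow 0, \\
0 \longrightarrow C_1 \xrightarrow{\ \psi_2\ } C \longrightarrow C/C_1 \longrightarrow 0,
\end{gather*}

the first and the third are trivially exact, and the second is exact because $\ker \psi_1 = 
\ker \psi = \operatorname{image} \varphi = \operatorname{image} \varphi_2$. The hypothesis that F carries short exact 
sequences to short exact sequences thus implies that the three sequences 

\begin{gather*}
F(A) \xrightarrow{\ F(\varphi_1)\ } F(B_1) \longrightarrow 0, \\
F(B_1) \xrightarrow{\ F(\varphi_2)\ } F(B) \xrightarrow{\ F(\psi_1)\ } F(C_1), \\
0 \longrightarrow F(C_1) \xrightarrow{\ F(\psi_2)\ } F(C)
\end{gather*}

are exact. From these, $\ker F(\psi_1) = \operatorname{image} F(\varphi_2)$. Also, $F(\psi_2)$ is one-one, so 
that 

\[ \ker F(\psi) = \ker\bigl(F(\psi_2)F(\psi_1)\bigr) = \ker F(\psi_1), \]

and $F(\varphi_1)$ is onto, so that 

\[ \operatorname{image} F(\varphi_2) = \operatorname{image}\bigl(F(\varphi_2)F(\varphi_1)\bigr) = \operatorname{image} F(\varphi). \]

Hence $\ker F(\psi) = \operatorname{image} F(\varphi)$, and 

\[ F(A) \xrightarrow{\ F(\varphi)\ } F(B) \xrightarrow{\ F(\psi)\ } F(C) \]

is exact, as required. 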


Proposition 4.4. Let $F : \mathcal{C} \to \mathcal{C}'$ be an additive functor between good 
categories, let X be a complex in C, and let F(X) be the corresponding complex 
in C’. If F is exact, then F carries the homology or cohomology of X to the 
homology or cohomology of F(X). 


REMARKS. Our convention is to refer to homology when the indexing goes 
down and cohomology when the indexing goes up. If F is covariant, it preserves 
the indexing, while if F is contravariant, it reverses it. For the proof we shall use 
notation A, B,C for modules that is neutral with respect to the indexing. The 
arguments are qualitatively different in the covariant and contravariant cases, and 
we shall give both of them. 


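PROOF IN THE COVARIANT CASE. Let 

\[ A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \]

be a given complex, thus having $\psi\varphi = 0$, and form the image complex 

\[ F(A) \xrightarrow{\ F(\varphi)\ } F(B) \xrightarrow{\ F(\psi)\ } F(C). \]

We are to exhibit an isomorphism 

\[ F(\ker \psi/\operatorname{image} \varphi) \cong \ker F(\psi)/\operatorname{image} F(\varphi). \tag{$*$} \]

Let $i : \operatorname{image} \varphi \to \ker \psi$ and $j : \ker \psi \to B$ be the inclusions, and let 
$q : \ker \psi \to \ker \psi/\operatorname{image} \varphi$ be the quotient map. Applying F to the exact 
sequence 

\[ 0 \longrightarrow \operatorname{image} \varphi \xrightarrow{\ i\ } \ker \psi \xrightarrow{\ q\ } \ker \psi/\operatorname{image} \varphi \longrightarrow 0 \]

and using exactness, we obtain an isomorphism via F(q): 

\[ F(\ker \psi/\operatorname{image} \varphi) \cong F(\ker \psi)/F(i)F(\operatorname{image} \varphi). \tag{$**$} \]

Since j is one-one and F is exact, F(j) is one-one. Thus application of F(j) to 
the right side of (**) gives 

\[ F(\ker \psi/\operatorname{image} \varphi) \cong F(j)F(\ker \psi)/F(ji)F(\operatorname{image} \varphi). \tag{$\dagger$} \]

If $\bar{\varphi}$ denotes $\varphi$ with its range reduced to its image, then $\varphi = ji\bar{\varphi}$. Applying F to 
the two exact sequences 

\begin{gather*}
\ker \psi \xrightarrow{\ j\ } B \xrightarrow{\ \psi\ } C, \\
A \xrightarrow{\ \bar{\varphi}\ } \operatorname{image} \varphi \longrightarrow 0
\end{gather*}

gives us $F(j)F(\ker \psi) = \ker F(\psi)$ and $F(\operatorname{image} \varphi) = F(\bar{\varphi})F(A)$. Applying 
F(ji) to the second of these and substituting both into the right side of (†) 
transforms (†) into (*) and gives the required isomorphism. 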


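PROOF IN THE CONTRAVARIANT CASE. Let
\[ A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \]
be given with $\psi\varphi = 0$, and form the image complex

\[ F(A) \xleftarrow{F(\varphi)} F(B) \xleftarrow{F(\psi)} F(C). \]

We are to exhibit an isomorphism

\[ F(\ker\psi/\operatorname{image}\varphi) \cong \ker F(\varphi)/\operatorname{image} F(\psi). \tag{$*$} \]

Let $j : \ker\psi \to B$ be the inclusion, let $\bar j : \ker\psi/\operatorname{image}\varphi \to B/\operatorname{image}\varphi$ be the induced map between quotients, and let $q$, $q'$, $q''$ be the quotient maps
\[
\begin{aligned}
q &: B \to B/\ker\psi,\\
q' &: B \to B/\operatorname{image}\varphi,\\
q'' &: B/\operatorname{image}\varphi \to B/\ker\psi.
\end{aligned}
\]

These satisfy $q = q''q'$. Applying $F$ to the exact sequence

\[ 0 \longrightarrow \ker\psi/\operatorname{image}\varphi \xrightarrow{\ \bar j\ } B/\operatorname{image}\varphi \xrightarrow{\ q''\ } B/\ker\psi \longrightarrow 0 \]
and using exactness, we obtain an isomorphism via $F(\bar j)$:
\[ F(\ker\psi/\operatorname{image}\varphi) \cong F(B/\operatorname{image}\varphi)/F(q'')F(B/\ker\psi). \tag{$**$} \]

Since $q'$ is onto and $F$ is exact, $F(q')$ is one-one. Thus application of $F(q')$ to the right side of $(**)$ gives

\[ F(\ker\psi/\operatorname{image}\varphi) \cong F(q')F(B/\operatorname{image}\varphi)/F(q)F(B/\ker\psi). \tag{$\dagger$} \]

Applying $F$ to the three exact sequences
\[
\begin{gathered}
A \xrightarrow{\ \varphi\ } B \xrightarrow{\ q'\ } B/\operatorname{image}\varphi,\\
\ker\psi \xrightarrow{\ j\ } B \xrightarrow{\ \psi\ } C,\\
\ker\psi \xrightarrow{\ j\ } B \xrightarrow{\ q\ } B/\ker\psi
\end{gathered}
\]
gives us $F(q')F(B/\operatorname{image}\varphi) = \ker F(\varphi)$ and $F(q)F(B/\ker\psi) = \ker F(j) = \operatorname{image} F(\psi)$. Substituting both these equalities into the right side of $(\dagger)$ transforms $(\dagger)$ into $(*)$ and gives the required isomorphism.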


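We were reminded before Proposition 4.3 that $\operatorname{Hom}_R$ and $\otimes_R$ need not yield exact functors. The partial exactness that they exhibit, as opposed to exactness itself, is more typical of additive functors, and we incorporate this behavior into two definitions. We shall define left and right exactness in such a way that $\operatorname{Hom}_R$ is left exact in each variable and $\otimes_R$ is right exact. An additive functor $F$ is left exact if the exactness of

\[ 0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0 \]

implies the exactness of

\[
\begin{aligned}
&0 \longrightarrow F(A) \xrightarrow{F(\varphi)} F(B) \xrightarrow{F(\psi)} F(C) && (F \text{ covariant}),\\
&0 \longrightarrow F(C) \xrightarrow{F(\psi)} F(B) \xrightarrow{F(\varphi)} F(A) && (F \text{ contravariant}).
\end{aligned}
\]

We say that $F$ is right exact if the exactness of the sequence with $0, A, B, C, 0$ above implies the exactness of

\[
\begin{aligned}
&F(A) \xrightarrow{F(\varphi)} F(B) \xrightarrow{F(\psi)} F(C) \longrightarrow 0 && (F \text{ covariant}),\\
&F(C) \xrightarrow{F(\psi)} F(B) \xrightarrow{F(\varphi)} F(A) \longrightarrow 0 && (F \text{ contravariant}).
\end{aligned}
\]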


The words “left” and “right” refer to the part of the target sequence that is exact when the arrows are arranged to point to the right. A consequence (but not the full content) of these definitions in each case is an assertion about one-one or onto maps. For example a left exact covariant $F$ carries one-one maps to one-one maps; we have only to start from a one-one map $\varphi : A \to B$ and set up a short exact sequence with $C = B/\operatorname{image}\varphi$, and the definition shows that $F(\varphi)$ is one-one.
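
As a hedged illustration of why only partial exactness can be expected, here is a standard computation. The choice of $R = \mathbb{Z}$ and of the short exact sequence $0 \to \mathbb{Z} \to \mathbb{Z} \to \mathbb{Z}/2\mathbb{Z} \to 0$ is an assumption made for the example and is not taken from the text. It applies $\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/2\mathbb{Z},\,\cdot\,)$ and $(\,\cdot\,)\otimes_{\mathbb{Z}}\mathbb{Z}/2\mathbb{Z}$ to that sequence.

```latex
% Assumed example: the short exact sequence of abelian groups
%   0 --> Z --(multiplication by 2)--> Z --(pi)--> Z/2Z --> 0.
% Hom_Z(Z/2Z, Z) = 0 and Hom_Z(Z/2Z, Z/2Z) = Z/2Z; tensoring turns multiplication by 2 into 0.
\[
\begin{aligned}
&\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/2\mathbb{Z},\,\cdot\,):
 && 0 \longrightarrow 0 \longrightarrow 0 \longrightarrow \mathbb{Z}/2\mathbb{Z}
 && \text{(exact, but the last map is not onto)},\\
&(\,\cdot\,)\otimes_{\mathbb{Z}}\mathbb{Z}/2\mathbb{Z}:
 && \mathbb{Z}/2\mathbb{Z} \xrightarrow{\ 0\ } \mathbb{Z}/2\mathbb{Z}
    \xrightarrow{\ \cong\ } \mathbb{Z}/2\mathbb{Z} \longrightarrow 0
 && \text{(exact, but the first map is not one-one)}.
\end{aligned}
\]
```

In this example the first line is what left exactness predicts and no more, and the second line is what right exactness predicts and no more.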


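Proposition 4.5. If $F$ is a covariant left exact functor, then $F$ carries an exact sequence

\[ 0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \]

into an exact sequence

\[ 0 \longrightarrow F(A) \xrightarrow{F(\varphi)} F(B) \xrightarrow{F(\psi)} F(C). \]


REMARK. The expected analogs of this result are valid if $F$ is contravariant or if $F$ is right exact or both.


PROOF. Starting from the given exact sequence, let $i : \operatorname{image}\psi \to C$ be the inclusion, and let $\bar\psi : B \to \operatorname{image}\psi$ be $\psi$ with its range space reduced. Then $\psi = i\bar\psi$, and the sequences

\[
\begin{gathered}
0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \bar\psi\ } \operatorname{image}\psi \longrightarrow 0\\
\text{and}\quad 0 \longrightarrow \operatorname{image}\psi \xrightarrow{\ i\ } C \longrightarrow C/\operatorname{image}\psi \longrightarrow 0
\end{gathered}
\]
are exact. Applying $F$ and using its left exactness, we see that

\[
\begin{gathered}
0 \longrightarrow F(A) \xrightarrow{F(\varphi)} F(B) \xrightarrow{F(\bar\psi)} F(\operatorname{image}\psi)\\
\text{and}\quad 0 \longrightarrow F(\operatorname{image}\psi) \xrightarrow{F(i)} F(C)
\end{gathered}
\]

are exact. Thus $F(i)$ is one-one, and $F(\psi) = F(i\bar\psi) = F(i)F(\bar\psi)$ has the same kernel as $F(\bar\psi)$. The exactness of the first image complex shows that $\ker F(\bar\psi) = \operatorname{image} F(\varphi)$, and the proof of the required exactness is complete.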




3. Long Exact Sequences 


As in Section 2, let $\mathcal{C}$ be a good category. We have seen that chain complexes in $\mathcal{C}$ themselves form a category whose morphisms are chain maps. If we have several chain maps in succession, each with an index $n \in \mathbb{Z}$, we can say that they form an “exact sequence” of chain maps if for each $n$, the sequences of modules and maps having index $n$ form an exact sequence in $\mathcal{C}$. Our objective in this section is to show that any short exact sequence of complexes of this kind yields a “long exact sequence” of modules and maps in $\mathcal{C}$ involving all the indices. More precisely we are able to construct for each $n$ a “connecting homomorphism” relating² what happens with each index $n$ to what happens for index $n+1$ or $n-1$ and incorporating modules and maps for all indices into a single exact sequence of infinite length.


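By way of preparation for the construction of connecting homomorphisms, let us be more explicit about the discussion in Section 2 of how a chain map carries the homology of one complex to the homology of another complex. Let

\[
\begin{array}{ccc}
A & \xrightarrow{\ \varphi\ } & B\\
\downarrow{\scriptstyle\alpha} & & \downarrow{\scriptstyle\beta}\\
A' & \xrightarrow{\ \varphi'\ } & B'
\end{array}
\]

be a commutative diagram in the good category $\mathcal{C}$. Let us observe that $\varphi(\ker\alpha) \subseteq \ker\beta$; in fact, any $a \in \ker\alpha$ has $0 = \varphi'\alpha(a) = \beta\varphi(a)$, and thus $\varphi(a)$ is in $\ker\beta$. Let us observe further that $\varphi'(\alpha(A)) = \beta(\varphi(A)) \subseteq \beta(B)$; since $\varphi'$ carries $A'$ into $B'$, it follows that $\varphi'$ descends to a mapping $\bar\varphi'$ defined on $A'/\alpha(A) = \operatorname{coker}\alpha$ and taking values in $B'/\beta(B) = \operatorname{coker}\beta$. We can summarize these remarks by the inclusions

\[ \varphi(\ker\alpha) \subseteq \ker\beta \quad\text{and}\quad \bar\varphi'(\operatorname{coker}\alpha) \subseteq \operatorname{coker}\beta. \]

Using these remarks, we can now construct a “connecting homomorphism” whenever we have a diagram as in Figure 4.1 below.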


²For readers familiar with the use of homology in topology, connecting homomorphisms arise
when one works with the homology of a topological space, the homology of a subspace, and the 
relative homology of the space and the subspace; the construction in this section may be regarded 
as an abstract version of that construction. 




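\[
\begin{array}{ccccccccc}
 & & A & \xrightarrow{\ \varphi\ } & B & \xrightarrow{\ \psi\ } & C & \longrightarrow & 0\\
 & & \downarrow{\scriptstyle\alpha} & & \downarrow{\scriptstyle\beta} & & \downarrow{\scriptstyle\gamma} & & \\
0 & \longrightarrow & A' & \xrightarrow{\ \varphi'\ } & B' & \xrightarrow{\ \psi'\ } & C' & &
\end{array}
\]

FIGURE 4.1. Snake diagram. The rows are assumed exact, and the squares commute. In this situation the Snake Lemma constructs a connecting homomorphism $\omega : \ker\gamma \to \operatorname{coker}\alpha$.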


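Lemma 4.6 (Snake Lemma). In a good category $\mathcal{C}$, a snake diagram as in Figure 4.1 induces a homomorphism $\omega : \ker\gamma \to \operatorname{coker}\alpha$ with
\[ \ker\omega = \psi(\ker\beta) \quad\text{and}\quad \operatorname{image}\omega = (\varphi')^{-1}(\operatorname{image}\beta)/\operatorname{image}\alpha, \]
and with $\omega(c) = (\varphi')^{-1}(\beta(\psi^{-1}(c))) + \operatorname{image}\alpha$ for $c \in \ker\gamma$, and then

\[ \ker\alpha \xrightarrow{\ \bar\varphi\ } \ker\beta \xrightarrow{\ \bar\psi\ } \ker\gamma \xrightarrow{\ \omega\ } \operatorname{coker}\alpha \xrightarrow{\ \bar\varphi'\ } \operatorname{coker}\beta \xrightarrow{\ \bar\psi'\ } \operatorname{coker}\gamma \]

is an exact sequence. Here $\bar\varphi$ and $\bar\psi$ are restrictions of $\varphi$ and $\psi$, and $\bar\varphi'$ and $\bar\psi'$ are descended versions of $\varphi'$ and $\psi'$. If $\varphi$ is one-one, then $\bar\varphi$ is one-one. If $\psi'$ is onto $C'$, then $\bar\psi'$ is onto $\operatorname{coker}\gamma$.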


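REMARKS. The homomorphism $\omega$ is called a connecting homomorphism. The name “Snake Lemma” comes from the pattern that the six-term exact sequence makes when superimposed on the enlarged version of Figure 4.1 shown in Figure 4.2.

\[
\begin{array}{ccccccccc}
 & & \ker\alpha & \longrightarrow & \ker\beta & \longrightarrow & \ker\gamma & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & A & \longrightarrow & B & \longrightarrow & C & \longrightarrow & 0\\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \longrightarrow & A' & \longrightarrow & B' & \longrightarrow & C' & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & \operatorname{coker}\alpha & \longrightarrow & \operatorname{coker}\beta & \longrightarrow & \operatorname{coker}\gamma & &
\end{array}
\]

FIGURE 4.2. Enlarged snake diagram.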




PROOF. First let us construct $\omega$ and see that it is well defined. Let $c$ be in $\ker\gamma$. Since $\psi$ is onto $C$, write $c = \psi(b)$ for some $b \in B$. The commutativity of the second square in Figure 4.1 gives $0 = \gamma(c) = \gamma\psi(b) = \psi'(\beta b)$. Thus $\beta(b)$ is in $\ker\psi' = \operatorname{image}\varphi'$, and $\beta(b) = \varphi'(a')$ for some $a' \in A'$; the element $a'$ is uniquely determined, since $\varphi'$ is one-one. Define $\omega(c) = a' + \alpha(A)$.

The only choice in this definition is that of $b$, and we are to show that any other choice leads to the same member of $\operatorname{coker}\alpha$. If $\bar b$ is another choice and if $\beta(\bar b) = \varphi'(\bar a')$ with $\bar a' \in A'$, then $\psi(b - \bar b) = c - c = 0$ shows that $b - \bar b = \varphi(a)$ for some $a \in A$. Thus $\varphi'(a' - \bar a') = \beta(b - \bar b) = \beta\varphi(a) = \varphi'(\alpha(a))$. Since $\varphi'$ is one-one, $a' - \bar a' = \alpha(a)$, and $a'$ and $\bar a'$ are exhibited as in the same coset of $A'$ modulo $\alpha(A)$.

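Let us compute $\ker\omega$. Suppose that $\omega(c) = 0$, i.e., that $\omega(c)$ is the trivial coset $\alpha(A)$. By construction of $\omega$, $\omega(c) = a' + \alpha(A)$ for an element $a' \in A'$ such that $\beta(b) = \varphi'(a')$ and $c = \psi(b)$. In this case, $a' = \alpha(a)$ for some $a \in A$. So $\beta(b) = \varphi'\alpha(a) = \beta\varphi(a)$, and thus $b - \varphi(a)$ is in $\ker\beta$. Consequently $c = \psi(b) = \psi(b) - \psi\varphi(a) = \psi(b - \varphi(a))$ is in $\psi(\ker\beta)$, and $\ker\omega \subseteq \psi(\ker\beta)$. For the reverse inclusion, if $c$ is in $\psi(\ker\beta)$, choose $b \in \ker\beta$ with $\psi(b) = c$. Then $\gamma(c) = \gamma\psi(b) = \psi'\beta(b) = 0$ shows that $\omega(c)$ is defined. Since $c = \psi(b)$, the construction of $\omega$ shows that $\beta(b) = \varphi'(a')$ for some $a' \in A'$. Since $b$ is in $\ker\beta$ and since $\varphi'$ is one-one, this $a'$ must be 0. Then $\omega(c) = a' + \alpha(A) = 0 + \alpha(A)$, $c$ is in $\ker\omega$, and $\psi(\ker\beta) \subseteq \ker\omega$.

Now we compute $\operatorname{image}\omega$. Our step-by-step definition of $\omega$ shows that $\operatorname{image}\omega \subseteq (\varphi')^{-1}(\operatorname{image}\beta)/\alpha(A)$. For the reverse inclusion, suppose that $a' \in A'$ is in $(\varphi')^{-1}(\operatorname{image}\beta)$, i.e., has $\varphi'(a') = \beta(b)$ for some $b \in B$. Then the element $c = \psi(b)$ of $C$ has $\gamma(c) = \gamma\psi(b) = \psi'\beta(b) = \psi'\varphi'(a') = 0$, and $\omega(c)$ is therefore defined. Our definition of $\omega$ makes $\omega(c) = a' + \alpha(A)$, and thus $(\varphi')^{-1}(\operatorname{image}\beta)/\alpha(A) \subseteq \operatorname{image}\omega$.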

We are left with establishing the exactness of the displayed sequence of six 
terms at the four positions other than the ends and with proving the two assertions 
in the last sentence of the lemma. 

The condition of exactness at $\ker\beta$ is that $\varphi(\ker\alpha) = \ker\psi \cap \ker\beta$. The inclusion $\subseteq$ follows from the equalities $0 = \psi\varphi$ and $\beta\varphi(\ker\alpha) = \varphi'\alpha(\ker\alpha) = 0$. For the inclusion $\supseteq$, let $b \in B$ satisfy $\psi(b) = \beta(b) = 0$. Exactness at $B$ gives $b = \varphi(a)$ with $a \in A$. Then $0 = \beta(b) = \beta\varphi(a) = \varphi'\alpha(a)$ with $\varphi'$ one-one implies that $\alpha(a) = 0$, and $a$ is in $\ker\alpha$. Thus $b$ is in $\varphi(\ker\alpha)$, and exactness at $\ker\beta$ is proved. If $\varphi$ is one-one, then certainly its restriction $\bar\varphi$ is one-one.

The condition of exactness at $\ker\gamma$ is that $\ker\omega = \psi(\ker\beta)$, and this was proved in the third paragraph of the proof.

By the result of the fourth paragraph, the condition of exactness at $\operatorname{coker}\alpha$ is that $(\varphi')^{-1}(\beta(B))/\alpha(A)$ equal $\ker\bar\varphi'$, where $\bar\varphi' : A'/\alpha(A) \to B'/\beta(B)$ is the map induced by $\varphi'$. The members of $\ker\bar\varphi'$ are those cosets $a' + \alpha(A)$ with $\varphi'(a' + \alpha(A)) \subseteq \beta(B)$. Since $\varphi'\alpha(A) = \beta\varphi(A) \subseteq \beta(B)$, the condition on $a' + \alpha(A)$ is that $\varphi'(a')$ be in $\beta(B)$, hence that $a'$ be in $(\varphi')^{-1}(\beta(B))$, hence that the coset $a' + \alpha(A)$ be in $(\varphi')^{-1}(\beta(B))/\alpha(A)$. Thus we have exactness at $\operatorname{coker}\alpha$.

At $\operatorname{coker}\beta$, we know that the descended map $\bar\varphi'$ maps $\operatorname{coker}\alpha$ into $\operatorname{coker}\beta$, and we are to show that $\bar\varphi'(\operatorname{coker}\alpha) = \ker\bar\psi'$. Inclusion $\subseteq$ follows because $\psi'\varphi' = 0$ implies $\bar\psi'\bar\varphi'(a' + \alpha(A)) = \bar\psi'(\varphi'(a') + \beta(B)) = \psi'\varphi'(a') + \gamma(C) = \gamma(C)$. For the reverse inclusion let $b' \in B'$ have $\bar\psi'(b' + \beta(B)) = \gamma(C)$. Then $\psi'(b')$ is in $\gamma(C)$. Since $\psi : B \to C$ is onto, we can find $b \in B$ with $\psi'(b') = \gamma\psi(b) = \psi'\beta(b)$. Hence $b' - \beta(b)$ is in $\ker\psi' = \operatorname{image}\varphi'$, and $b' - \beta(b) = \varphi'(a')$ for some $a' \in A'$. Consequently $b' + \beta(B) = \varphi'(a') + \beta(b) + \beta(B) = \varphi'(a') + \beta(B) = \bar\varphi'(a' + \alpha(A))$, and $b' + \beta(B)$ is exhibited as in $\bar\varphi'(\operatorname{coker}\alpha)$. Thus we have exactness at $\operatorname{coker}\beta$. Finally if $\psi'$ is onto $C'$, then certainly its descended map $\bar\psi'$ is onto $\operatorname{coker}\gamma$. This completes the proof.
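
To make the construction of $\omega$ concrete, here is a small worked instance. The particular diagram is an assumed example chosen for illustration and does not come from the text: the category is that of abelian groups, both rows are $0 \to \mathbb{Z} \to \mathbb{Z} \to \mathbb{Z}/2\mathbb{Z} \to 0$ with first map multiplication by 2 and second map the quotient map $\pi$, and all three vertical maps are multiplication by 2. It is meant only as a sketch of how the recipe in the proof is applied.

```latex
% Assumed data: \varphi = \varphi' = (multiplication by 2),  \psi = \psi' = \pi : Z --> Z/2Z,
% \alpha = \beta = (multiplication by 2) on Z,  \gamma = (multiplication by 2) = 0 on Z/2Z.
% Both squares commute: \beta\varphi = 4 = \varphi'\alpha  and  \gamma\psi = 0 = \psi'\beta.
\[
\begin{aligned}
&\ker\alpha = 0,\qquad \ker\beta = 0,\qquad \ker\gamma = \mathbb{Z}/2\mathbb{Z},\\
&\operatorname{coker}\alpha = \mathbb{Z}/2\mathbb{Z},\qquad
 \operatorname{coker}\beta = \mathbb{Z}/2\mathbb{Z},\qquad
 \operatorname{coker}\gamma = \mathbb{Z}/2\mathbb{Z}.
\end{aligned}
\]
% The recipe for \omega applied to the nonzero element c = 1 + 2Z of \ker\gamma:
% lift c to b = 1, apply \beta, and pull back along \varphi'.
\[
c = \psi(1),\qquad \beta(1) = 2 = \varphi'(1),\qquad
\omega(c) = 1 + \alpha(\mathbb{Z}) = 1 + 2\mathbb{Z} \neq 0 .
\]
% The six-term sequence of the lemma in this case:
\[
0 \longrightarrow 0 \longrightarrow \mathbb{Z}/2\mathbb{Z}
\xrightarrow{\ \omega\ } \mathbb{Z}/2\mathbb{Z}
\xrightarrow{\ \bar\varphi'\,=\,0\ } \mathbb{Z}/2\mathbb{Z}
\xrightarrow{\ \bar\psi'\ } \mathbb{Z}/2\mathbb{Z}.
\]
```

Here $\omega$ and $\bar\psi'$ are isomorphisms and $\bar\varphi'$ is zero, so that $\ker\omega = 0 = \psi(\ker\beta)$, $\operatorname{image}\omega = \ker\bar\varphi'$, and $\operatorname{image}\bar\varphi' = 0 = \ker\bar\psi'$, in agreement with the lemma.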


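Theorem 4.7. Let $A = \{(A_n, \alpha_n)\}$, $B = \{(B_n, \beta_n)\}$, and $C = \{(C_n, \gamma_n)\}$ be chain complexes in a good category $\mathcal{C}$, and suppose that $\varphi = \{\varphi_n\} : A \to B$ and $\psi = \{\psi_n\} : B \to C$ are chain maps such that the sequence

\[ 0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0 \]

of chain complexes is exact. Then this exact sequence of chain complexes induces an exact sequence in homology of the form

\[ \cdots \xrightarrow{\bar\psi_{n+1}} H_{n+1}(C) \xrightarrow{\bar\omega_n} H_n(A) \xrightarrow{\bar\varphi_n} H_n(B) \xrightarrow{\bar\psi_n} H_n(C) \xrightarrow{\bar\omega_{n-1}} H_{n-1}(A) \xrightarrow{\bar\varphi_{n-1}} \cdots . \]

Here the map $\bar\omega_n : H_{n+1}(C) \to H_n(A)$ has descended from the connecting homomorphism $\omega_n$ defined on $\ker\gamma_n$ in $C_{n+1}$ and having range $\operatorname{coker}\alpha_n = A_n/\operatorname{image}\alpha_n$.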


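REMARKS.

(1) The exact sequence in homology is called the long exact sequence in homology corresponding to the short exact sequence of chain complexes, and the maps $\bar\omega_n$ are called connecting homomorphisms. As the proof will show, these connecting homomorphisms arise by two applications of the Snake Lemma, not just one.


(2) In more detail the diagram of the short exact sequence of chain complexes is of the form

\[
\begin{array}{ccccccccc}
 & & \downarrow{\scriptstyle\alpha_{n+1}} & & \downarrow{\scriptstyle\beta_{n+1}} & & \downarrow{\scriptstyle\gamma_{n+1}} & & \\
0 & \longrightarrow & A_{n+1} & \xrightarrow{\varphi_{n+1}} & B_{n+1} & \xrightarrow{\psi_{n+1}} & C_{n+1} & \longrightarrow & 0\\
 & & \downarrow{\scriptstyle\alpha_n} & & \downarrow{\scriptstyle\beta_n} & & \downarrow{\scriptstyle\gamma_n} & & \\
0 & \longrightarrow & A_n & \xrightarrow{\ \varphi_n\ } & B_n & \xrightarrow{\ \psi_n\ } & C_n & \longrightarrow & 0\\
 & & \downarrow{\scriptstyle\alpha_{n-1}} & & \downarrow{\scriptstyle\beta_{n-1}} & & \downarrow{\scriptstyle\gamma_{n-1}} & &
\end{array}
\]

The rows are exact, the columns are chain complexes, and the squares commute.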


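(3) The corresponding result for cochain complexes involves the diagram

\[
\begin{array}{ccccccccc}
 & & \uparrow{\scriptstyle\alpha_n} & & \uparrow{\scriptstyle\beta_n} & & \uparrow{\scriptstyle\gamma_n} & & \\
0 & \longrightarrow & A_n & \xrightarrow{\ \varphi_n\ } & B_n & \xrightarrow{\ \psi_n\ } & C_n & \longrightarrow & 0\\
 & & \uparrow{\scriptstyle\alpha_{n-1}} & & \uparrow{\scriptstyle\beta_{n-1}} & & \uparrow{\scriptstyle\gamma_{n-1}} & & \\
0 & \longrightarrow & A_{n-1} & \xrightarrow{\varphi_{n-1}} & B_{n-1} & \xrightarrow{\psi_{n-1}} & C_{n-1} & \longrightarrow & 0
\end{array}
\]

and the corresponding long exact sequence in cohomology is

\[ \cdots \longrightarrow H^{n-1}(C) \xrightarrow{\bar\omega_n} H^n(A) \xrightarrow{\bar\varphi_n} H^n(B) \xrightarrow{\bar\psi_n} H^n(C) \xrightarrow{\bar\omega_{n+1}} H^{n+1}(A) \longrightarrow \cdots . \]

The result for cochain complexes is a consequence of the result for chain complexes and follows by making adjustments in the notation.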




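PROOF. We regard the top two displayed rows of the diagram in Remark 2 as a snake diagram. Applying the Snake Lemma (Lemma 4.6), we obtain a connecting homomorphism $\omega_n$ and an exact sequence

\[ \ker\alpha_n \xrightarrow{\bar\varphi_{n+1}} \ker\beta_n \xrightarrow{\bar\psi_{n+1}} \ker\gamma_n \xrightarrow{\ \omega_n\ } \operatorname{coker}\alpha_n \xrightarrow{\ \bar\varphi_n\ } \operatorname{coker}\beta_n \xrightarrow{\ \bar\psi_n\ } \operatorname{coker}\gamma_n . \]

Using Proposition 4.2 for each of the chain complexes $A = \{(A_n, \alpha_n)\}$, $B = \{(B_n, \beta_n)\}$, and $C = \{(C_n, \gamma_n)\}$, we see that we obtain a diagram

\[
\begin{array}{ccccccccc}
 & & 0 & & 0 & & 0 & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & H_n(A) & & H_n(B) & & H_n(C) & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & \operatorname{coker}\alpha_n & \xrightarrow{\ \bar\varphi_n\ } & \operatorname{coker}\beta_n & \xrightarrow{\ \bar\psi_n\ } & \operatorname{coker}\gamma_n & \longrightarrow & 0\\
 & & \downarrow{\scriptstyle\bar\alpha_{n-1}} & & \downarrow{\scriptstyle\bar\beta_{n-1}} & & \downarrow{\scriptstyle\bar\gamma_{n-1}} & & \\
0 & \longrightarrow & \ker\alpha_{n-2} & \xrightarrow{\bar\varphi_{n-1}} & \ker\beta_{n-2} & \xrightarrow{\bar\psi_{n-1}} & \ker\gamma_{n-2} & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & H_{n-1}(A) & & H_{n-1}(B) & & H_{n-1}(C) & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & 0 & & 0 & & 0 & &
\end{array}
\]

in which the rows and columns are exact and the squares commute. The third and fourth rows form a snake diagram, and the second and fifth rows identify the kernels and cokernels. Thus the Snake Lemma gives us an exact sequence

\[ H_n(A) \xrightarrow{\bar\varphi_n} H_n(B) \xrightarrow{\bar\psi_n} H_n(C) \xrightarrow{\ \Omega\ } H_{n-1}(A) \xrightarrow{\bar\varphi_{n-1}} H_{n-1}(B) \xrightarrow{\bar\psi_{n-1}} H_{n-1}(C) \]
for a suitable connecting homomorphism $\Omega$. Repeating this argument for all $n$ proves exactness at all modules of the long exact sequence.

To complete the proof, we have only to identify $\Omega$. Reference to the statement of the Snake Lemma shows that the formula for $\Omega$ is

\[ \Omega(\bar c) = (\bar\varphi_{n-1})^{-1}\big(\bar\beta_{n-1}(\bar\psi_n^{-1}(\bar c))\big) + \operatorname{image}\bar\alpha_{n-1} \]

for $\bar c \in H_n(C)$. Meanwhile, the connecting homomorphism from the first application of the Snake Lemma is $\omega_{n-1}(c) = (\varphi_{n-1})^{-1}(\beta_{n-1}(\psi_n^{-1}(c))) + \operatorname{image}\alpha_{n-1}$ for $c \in \ker\gamma_{n-1}$. Thus $\Omega(c + \operatorname{image}\gamma_n) = \omega_{n-1}(c) + \operatorname{image}\bar\alpha_{n-1}$ as asserted.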




Corollary 4.8. If

\[ 0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0 \]

is an exact sequence of chain complexes in a good category and if $A$ is exact, then $H_n(B) \cong H_n(C)$ for all $n$; if instead $C$ is exact, then $H_n(A) \cong H_n(B)$ for all $n$. Consequently if any two of the three chain complexes are exact, then the third one is exact.


PROOF. Theorem 4.7 gives the long exact sequence

\[ \cdots \longrightarrow H_{n+1}(C) \longrightarrow H_n(A) \longrightarrow H_n(B) \longrightarrow H_n(C) \longrightarrow H_{n-1}(A) \longrightarrow \cdots . \]

If $H_n(A) = 0$ and $H_{n-1}(A) = 0$, then we see that $H_n(B) \cong H_n(C)$. If $H_{n+1}(C) = 0$ and $H_n(C) = 0$, then we see that $H_n(A) \cong H_n(B)$.

If two of the three chain complexes are exact, then one of the two is A or C, 
and the result in the previous paragraph applies. Then the other two complexes 
(B and C, or A and B) have isomorphic homology. The hypothesis says that one 
of these two sequences of homology groups is 0. Therefore the other one is 0. 


To conclude the discussion, we shall prove results saying that the exact se- 
quences produced by Lemma 4.6 and Theorem 4.7 are functorial. 


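Lemma 4.9. In a good category $\mathcal{C}$, the six-term exact sequence that is obtained from a snake diagram as in Figure 4.1 is functorial in the following sense: If there are two horizontal planar snake diagrams, one with tildes (~) over all modules and maps and the other as is, and if there are vertical maps $f_A$, etc., in three dimensions from the tilde version of the snake diagram to the original version such that all vertical squares commute, then the squares of the diagram

\[
\begin{array}{ccccccccccc}
\ker\tilde\alpha & \xrightarrow{\bar{\tilde\varphi}} & \ker\tilde\beta & \xrightarrow{\bar{\tilde\psi}} & \ker\tilde\gamma & \xrightarrow{\tilde\omega} & \operatorname{coker}\tilde\alpha & \xrightarrow{\bar{\tilde\varphi}'} & \operatorname{coker}\tilde\beta & \xrightarrow{\bar{\tilde\psi}'} & \operatorname{coker}\tilde\gamma\\
\downarrow{\scriptstyle f_A} & & \downarrow{\scriptstyle f_B} & & \downarrow{\scriptstyle f_C} & & \downarrow{\scriptstyle \bar f_{A'}} & & \downarrow{\scriptstyle \bar f_{B'}} & & \downarrow{\scriptstyle \bar f_{C'}}\\
\ker\alpha & \xrightarrow{\ \bar\varphi\ } & \ker\beta & \xrightarrow{\ \bar\psi\ } & \ker\gamma & \xrightarrow{\ \omega\ } & \operatorname{coker}\alpha & \xrightarrow{\ \bar\varphi'\ } & \operatorname{coker}\beta & \xrightarrow{\ \bar\psi'\ } & \operatorname{coker}\gamma
\end{array}
\]

all commute.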


PROOF. For the first square from the left, the assumed commutativity shows that $f_{A'}\tilde\alpha = \alpha f_A$, and thus $x \in \ker\tilde\alpha$ implies $f_A(x) \in \ker\alpha$; similarly $x \in \ker\tilde\beta$ implies $f_B(x) \in \ker\beta$. Thus the maps of the square are well defined. We are given also that $\varphi f_A = f_B\tilde\varphi$, and this proves that the square commutes. The second square from the left is handled similarly.

For the fourth square from the left, the equation $f_{A'}\tilde\alpha = \alpha f_A$ shows that $y = \tilde\alpha(x)$ implies $f_{A'}(y) = \alpha(f_A(x))$, and thus $y \in \operatorname{image}\tilde\alpha$ implies $f_{A'}(y) \in \operatorname{image}\alpha$; this means that $f_{A'}$ descends to a map $\bar f_{A'}$ of $\operatorname{coker}\tilde\alpha$ to $\operatorname{coker}\alpha$. Similarly $f_{B'}$ descends to a map $\bar f_{B'}$ of $\operatorname{coker}\tilde\beta$ to $\operatorname{coker}\beta$. Thus the maps of the square are well defined. We are given also that $\varphi' f_{A'} = f_{B'}\tilde\varphi'$, and this proves that the square commutes. The fifth square from the left is handled similarly.

We are left with the third square from the left. The map at the left side of this square was shown to be well defined in the first paragraph of the proof, and the map at the right side of this square was shown to be meaningful in the second paragraph of the proof. We are to prove that the square commutes. Referring to the definition of $\tilde\omega$, let $\tilde c$ be in $\ker\tilde\gamma$, choose $\tilde b$ in $\tilde B$ with $\tilde\psi(\tilde b) = \tilde c$, and write $\tilde\beta(\tilde b) = \tilde\varphi'(\tilde a')$. Then $\tilde\omega(\tilde c)$ is defined to be the coset of $\tilde a'$. Using the assumed commutativity, we compute that $\psi f_B(\tilde b) = f_C\tilde\psi(\tilde b) = f_C(\tilde c)$ and that

\[ \varphi' f_{A'}(\tilde a') = f_{B'}\tilde\varphi'(\tilde a') = f_{B'}\tilde\beta(\tilde b) = \beta f_B(\tilde b). \]

Thus $f_B(\tilde b)$ is an element whose image under $\psi$ is $f_C(\tilde c)$, and $\beta$ of this element is $\varphi' f_{A'}(\tilde a')$. Consequently $\omega(f_C(\tilde c))$ is the coset of $f_{A'}(\tilde a')$, which is $\bar f_{A'}\tilde\omega(\tilde c)$. This proves the desired commutativity.


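Theorem 4.10. In a good category $\mathcal{C}$, the long exact sequence that is obtained from a short exact sequence of chain complexes as in Theorem 4.7 is functorial in the following sense: if there are two short exact sequences of chain complexes as in the theorem, one with tildes (~) over all modules and maps and the other as is, each viewed as lying in a horizontal plane, and if there are vertical maps $f_{A_n}$, etc., from the tilde version of the exact sequence of chain complexes to the original version such that all vertical squares commute, then the squares of the diagram

\[
\begin{array}{ccccccccccc}
\cdots \longrightarrow & H_{n+1}(\tilde C) & \xrightarrow{\bar{\tilde\omega}_n} & H_n(\tilde A) & \xrightarrow{\bar{\tilde\varphi}_n} & H_n(\tilde B) & \xrightarrow{\bar{\tilde\psi}_n} & H_n(\tilde C) & \xrightarrow{\bar{\tilde\omega}_{n-1}} & H_{n-1}(\tilde A) & \longrightarrow \cdots\\
 & \downarrow{\scriptstyle \bar f_{C_{n+1}}} & & \downarrow{\scriptstyle \bar f_{A_n}} & & \downarrow{\scriptstyle \bar f_{B_n}} & & \downarrow{\scriptstyle \bar f_{C_n}} & & \downarrow{\scriptstyle \bar f_{A_{n-1}}} & \\
\cdots \longrightarrow & H_{n+1}(C) & \xrightarrow{\ \bar\omega_n\ } & H_n(A) & \xrightarrow{\ \bar\varphi_n\ } & H_n(B) & \xrightarrow{\ \bar\psi_n\ } & H_n(C) & \xrightarrow{\bar\omega_{n-1}} & H_{n-1}(A) & \longrightarrow \cdots
\end{array}
\]

all commute.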


PROOF. Theorem 4.7 was proved by three applications of Proposition 4.2, 
which includes its own assertion of functoriality, and two applications of Lemma 
4.6, whose functoriality is addressed in Lemma 4.9. The argument involved only 
manipulations with diagrams, and functoriality is in place for every step. Hence 
functoriality is in place for the end result, and passage to the long exact sequence 
is functorial. 


4. Projectives and Injectives 


In Section III.5 we exploited the fact that certain complexes were exact and 
involved free modules in order to obtain chain maps and homotopies. The 
hypothesis “free” entered the arguments through Propositions 3.25 and 3.27; 
in both cases an R homomorphism was to be constructed from a free R module 
to some other R module, and a computation revealed how the R homomorphism 
should be defined on free generators. The universal mapping property of free 
modules allowed the R homomorphism to be extended from the generators to the 
whole free module. Examination of those arguments shows that it is enough to 
assume that the domain on which this R homomorphism is to be constructed is a 
“projective” R module, in the sense to be defined below, and we begin with that 
notion. 

Let $\mathcal{C}$ be a good category of unital left $R$ modules. We say that a module $P$ in this category is projective in $\mathcal{C}$ or is a projective in $\mathcal{C}$ if whenever a diagram in the category is given as in Figure 4.3 with $\psi$ mapping onto $B$, then there exists $\sigma : P \to C$ in $\mathcal{C}$ such that the diagram commutes.

\[
\begin{array}{ccccc}
 & & P & & \\
 & & {\scriptstyle\tau}\downarrow & \searrow{\scriptstyle\sigma} & \\
0 & \longleftarrow & B & \xleftarrow{\ \psi\ } & C
\end{array}
\]

FIGURE 4.3. Defining property of a projective.


If $P$ is a free $R$ module in $\mathcal{C}$, then $P$ is projective in $\mathcal{C}$. In fact, for each free generator $x$ of $P$, we choose an element $c_x$ in $C$ with $\psi(c_x) = \tau(x)$. Then we define $\sigma(x) = c_x$ and extend $\sigma$ to a homomorphism. We give further examples of projectives shortly. First let us establish in Lemma 4.11 an ostensibly stronger property that projectives automatically satisfy.


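Lemma 4.11. If $P$ is projective in the good category $\mathcal{C}$ and if the diagram

\[
\begin{array}{ccccc}
 & & P & & \\
 & & {\scriptstyle\tau}\downarrow & \searrow{\scriptstyle\sigma} & \\
A' & \xleftarrow{\ \varphi\ } & A & \xleftarrow{\ \psi\ } & A''
\end{array}
\]

in $\mathcal{C}$ has $\ker\varphi = \operatorname{image}\psi$ and $\varphi\tau = 0$, then there exists a map $\sigma : P \to A''$ in $\mathcal{C}$ such that the diagram commutes.


PROOF. The hypotheses force $\operatorname{image}\tau \subseteq \ker\varphi = \operatorname{image}\psi$. Thus if we put $B = \operatorname{image}\psi$ and $C = A''$, then the above diagram leads to the diagram in Figure 4.3. The hypothesis “projective” therefore gives us the map $\sigma$ in Figure 4.3 with $\tau = \psi\sigma$, and the same $\sigma$ is the required map here.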


EXAMPLES OF PROJECTIVES. 


(1) If $R$ is a field $F$ and if $\mathcal{C}$ is the category of all vector spaces over $F$, then every module is free, hence projective, since every vector space has a basis.


(2) For general $R$, if $\mathcal{C} = \mathcal{C}_R$ is the category of all unital left $R$ modules, then the projectives are the direct summands of free modules. This fact is easily verified from Figure 4.3 as follows: In one direction if $F = P \oplus P'$ is a free $R$ module and the diagram in Figure 4.3 is given, extend $\tau$ to $F$ as 0 on $P'$, find $\sigma$ from the fact that the free module $F$ is projective, and restrict $\sigma$ to $P$. In the other direction if $P$ is projective, find a free $R$ module $F$ mapping onto $P$ by a map $\psi$, and put $B = P$, $C = F$, and $\tau = 1_P$ in Figure 4.3. Then the equality $1_P = \tau = \psi\sigma$ forces $\sigma$ to be one-one, and it follows that $P \cong \operatorname{image}\sigma$. Consequently $F = \operatorname{image}\sigma \oplus \ker\psi$.


(3) For $R = \mathbb{Z}$, the category $\mathcal{C} = \mathcal{C}_{\mathbb{Z}}$ of all unital $R$ modules is the category of all abelian groups. Then the projective modules are the free abelian groups by (2), since any subgroup of a free abelian group is free abelian.


(4) For $R$ equal to any (commutative) principal ideal domain, the projective modules in the category $\mathcal{C}_R$ of all unital $R$ modules are the free modules, by the same argument as in (3) in combination with the Fundamental Theorem of Finitely Generated Modules (Theorem 8.25 of Basic Algebra).


(5) For R = Z, two good categories that were listed in Section 2 were the 
category of all finitely generated abelian groups and the category of all torsion 
abelian groups. With the first of these, the projectives are the free abelian groups 
of finite rank, by the same argument as in (3). With the second of these, Problem 1 
at the end of the chapter asks for a verification that some module in the category 
fails to be the image of any projective in the category. 
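
The following worked example is offered as a sketch of how the criterion in Example 2 separates “projective” from “free.” The ring $R = \mathbb{Z}/6\mathbb{Z}$ is an assumed choice made for illustration and does not appear in the text.

```latex
% Assumed example: R = Z/6Z, regarded as a module over itself.
% The elements 3 and 4 are orthogonal idempotents with sum 1:
\[
3\cdot 3 = 9 \equiv 3,\qquad 4\cdot 4 = 16 \equiv 4,\qquad 3\cdot 4 = 12 \equiv 0,\qquad 3 + 4 = 7 \equiv 1 \pmod 6 .
\]
% Hence the free module R splits as a direct sum of the ideals 3R and 4R:
\[
\mathbb{Z}/6\mathbb{Z} = 3R \oplus 4R,\qquad 3R = \{0,3\} \cong \mathbb{Z}/2\mathbb{Z},\qquad 4R = \{0,2,4\} \cong \mathbb{Z}/3\mathbb{Z}.
\]
% Conclusion: Z/2Z is a direct summand of a free R module, hence projective over R by Example 2,
% but it is not free over R, since a nonzero free R module has at least 6 elements.
```

By contrast, Example 3 shows that $\mathbb{Z}/2\mathbb{Z}$ is not projective as a module over $\mathbb{Z}$, since it is not a free abelian group. Projectivity thus depends on the ring and the category in question.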


We come to the main result concerning flexibility in setting up chain complexes. This result generalizes Proposition 3.25 through Corollary 3.30 in Section III.5.


Theorem 4.12. Let $X = \{(X_n, \partial_n)\}_{n=-\infty}^{\infty}$ and $X' = \{(X'_n, \partial'_n)\}_{n=-\infty}^{\infty}$ be chain complexes in the good category $\mathcal{C}$, and let $r$ be an integer. Let $\{f_n : X_n \to X'_n\}_{n \le r}$ be a family of maps in $\mathcal{C}$ such that $\partial'_{n-1} f_n = f_{n-1}\partial_{n-1}$ for $n \le r$. If $X_n$ is projective for $n > r$ and $X'$ is exact at each $X'_n$ with $n \ge r$, then $\{f_n : X_n \to X'_n\}_{n \le r}$ extends to a chain map $f : X \to X'$, and $f$ is unique up to homotopy. More precisely any two extensions are homotopic by a homotopy $h$ such that $h_n = 0$ for $n \le r$.


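REMARKS. The diagrams in question are

\[
\begin{array}{ccccccc}
\xrightarrow{\partial_{n+1}} & X_{n+1} & \xrightarrow{\ \partial_n\ } & X_n & \xrightarrow{\partial_{n-1}} & X_{n-1} & \xrightarrow{\partial_{n-2}}\\
 & \downarrow{\scriptstyle f_{n+1}} & & \downarrow{\scriptstyle f_n} & & \downarrow{\scriptstyle f_{n-1}} & \\
\xrightarrow{\partial'_{n+1}} & X'_{n+1} & \xrightarrow{\ \partial'_n\ } & X'_n & \xrightarrow{\partial'_{n-1}} & X'_{n-1} & \xrightarrow{\partial'_{n-2}}
\end{array}
\]

for the construction of the chain map and

\[
\begin{array}{ccccccccc}
\longrightarrow & X_{n+2} & \xrightarrow{\partial_{n+1}} & X_{n+1} & \xrightarrow{\ \partial_n\ } & X_n & \xrightarrow{\partial_{n-1}} & X_{n-1} & \longrightarrow\\
 & \downarrow{\scriptstyle f_{n+2}-g_{n+2}} & \swarrow{\scriptstyle h_{n+1}} & \downarrow{\scriptstyle f_{n+1}-g_{n+1}} & \swarrow{\scriptstyle h_n} & \downarrow{\scriptstyle f_n-g_n} & \swarrow{\scriptstyle h_{n-1}} & \downarrow{\scriptstyle f_{n-1}-g_{n-1}} & \\
\longrightarrow & X'_{n+2} & \xrightarrow{\partial'_{n+1}} & X'_{n+1} & \xrightarrow{\ \partial'_n\ } & X'_n & \xrightarrow{\partial'_{n-1}} & X'_{n-1} & \longrightarrow
\end{array}
\]

for the construction of the homotopy.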


PROOF. For the existence of the chain map, it is enough by induction to construct $f_{r+1}$. Matters are therefore as in the first of the above diagrams with $n = r$. Since $X'$ is exact at $X'_r$ and $X_{r+1}$ is projective, we are in the situation of Lemma 4.11 with $P = X_{r+1}$, $A'' = X'_{r+1}$, $A = X'_r$, $A' = X'_{r-1}$, $\psi = \partial'_r$, $\varphi = \partial'_{r-1}$, and $\tau = f_r\partial_r$. The lemma gives a map $\sigma : P \to A''$ with $\psi\sigma = \tau$. If we take $f_{r+1} = \sigma$, then $\psi\sigma = \tau$ says that $\partial'_r f_{r+1} = f_r\partial_r$, and the inductive construction of the chain map is complete.

For the uniqueness up to homotopy, let $f : X \to X'$ and $g : X \to X'$ be two chain maps such that $f_n = g_n$ for $n \le r$. Define $h_n : X_n \to X'_{n+1}$ to be 0 for $n \le r$, and observe that the system of functions $\{h_n\}_{n \le r}$ satisfies $h_{n-1}\partial_{n-1} + \partial'_n h_n = f_n - g_n$ for $n \le r$ because $f_n = g_n$ for $n \le r$. Proceeding inductively, suppose that $s \ge r$ and that $h_n$ has been constructed for $n \le s$ such that $h_{n-1}\partial_{n-1} + \partial'_n h_n = f_n - g_n$ for $n \le s$. We are to construct $h_{s+1} : X_{s+1} \to X'_{s+2}$. This is the situation of the second diagram above with $n = s$. Since $s \ge r$, $X'$ is exact at $X'_{s+1}$ and $X_{s+1}$ is projective. Thus we are in the situation of Lemma 4.11 with $P = X_{s+1}$, $A'' = X'_{s+2}$, $A = X'_{s+1}$, $A' = X'_s$, $\psi = \partial'_{s+1}$, $\varphi = \partial'_s$, and $\tau = (f_{s+1} - g_{s+1}) - h_s\partial_s$. The lemma gives a map $\sigma : P \to A''$ with $\psi\sigma = \tau$. If we take $h_{s+1} = \sigma$, then $\psi\sigma = \tau$ says that $\partial'_{s+1}h_{s+1} = (f_{s+1} - g_{s+1}) - h_s\partial_s$, and the inductive construction of the homotopy is complete.
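
Lemma 4.11 requires $\varphi\tau = 0$ in each of its two applications in this proof, and the proof leaves that check to the reader. The following short verification is offered as a sketch. It assumes the indexing $\partial_n : X_{n+1} \to X_n$ suggested by the diagrams, and it uses only $\partial_{n-1}\partial_n = 0$, the chain-map identities for $f$ and $g$, and the inductive hypothesis on $h$.

```latex
% Existence step: \varphi = \partial'_{r-1} and \tau = f_r\partial_r.
% Uses the given identity \partial'_{r-1} f_r = f_{r-1}\partial_{r-1} and \partial_{r-1}\partial_r = 0.
\[
\partial'_{r-1}(f_r\partial_r) = (f_{r-1}\partial_{r-1})\partial_r = f_{r-1}(\partial_{r-1}\partial_r) = 0 .
\]
% Homotopy step: \varphi = \partial'_s and \tau = (f_{s+1}-g_{s+1}) - h_s\partial_s.
% Uses \partial'_s f_{s+1} = f_s\partial_s, the same identity for g, and the inductive
% hypothesis  f_s - g_s = h_{s-1}\partial_{s-1} + \partial'_s h_s.
\[
\begin{aligned}
\partial'_s\big((f_{s+1}-g_{s+1}) - h_s\partial_s\big)
 &= (f_s - g_s)\partial_s - \partial'_s h_s\partial_s\\
 &= (h_{s-1}\partial_{s-1} + \partial'_s h_s)\partial_s - \partial'_s h_s\partial_s\\
 &= h_{s-1}(\partial_{s-1}\partial_s) = 0 .
\end{aligned}
\]
```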


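A resolution in the category $\mathcal{C}$ is an exact chain complex $X = \{(X_n, \partial_n)\}_{n=-\infty}^{\infty}$ or cochain complex $X = \{(X_n, d_n)\}_{n=-\infty}^{\infty}$ such that $X_n = 0$ for $n \le -2$. We say that the complex is a resolution of $X_{-1}$, and we abbreviate it as

\[ X = (X^+ \xrightarrow{\ \partial_{-1}\ } X_{-1}) \quad\text{or}\quad X = (X^+ \xleftarrow{\ d_{-1}\ } X_{-1}), \]

with $X^+$ referring to

\[
\begin{gathered}
X^+ :\ \cdots \xrightarrow{\ \partial_2\ } X_2 \xrightarrow{\ \partial_1\ } X_1 \xrightarrow{\ \partial_0\ } X_0\\
\text{or}\quad X^+ :\ \cdots \xleftarrow{\ d_2\ } X_2 \xleftarrow{\ d_1\ } X_1 \xleftarrow{\ d_0\ } X_0
\end{gathered}
\]

in the respective cases. A chain complex $X = (X^+ \longrightarrow M)$ that forms a resolution is called a free resolution of $M$ if every $X_n$ for $n \ge 0$ is a free module. It is called a projective resolution of $M$ if every $X_n$ for $n \ge 0$ is projective.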


Corollary 4.13. Let $M$ be a module in a good category $\mathcal{C}$ and let
\[ X = (X^+ \longrightarrow M) \quad\text{and}\quad X' = ((X')^+ \longrightarrow M) \]

be two projective resolutions of $M$. Then there exist chain maps $f : X \to X'$ and $g : X' \to X$ with $f_{-1} = 1_M$ and $g_{-1} = 1_M$, and any two such chain maps $f$ and $g$ have the property that $gf : X \to X$ is homotopic to $1_X$ and $fg : X' \to X'$ is homotopic to $1_{X'}$.


PROOF. The existence of $f$ extending $f_{-1} = 1_M$ is immediate by applying the first part of Theorem 4.12 with $r = -1$. The hypotheses apply because $X_n$ is projective for $n > -1$ and $X'$ is exact at $X'_n$ for $n \ge -1$. A similar argument shows the existence of $g$.

If we have $f$ and $g$, then $gf : X \to X$ and $1_X : X \to X$ are chain maps that extend the partial chain map given for $n \le -1$ by $1_M$ for $n = -1$ and by 0 for $n \le -2$. Since again $X_n$ is projective for $n > -1$ and $X$ is exact at $X_n$ for $n \ge -1$, the second part of the theorem shows that $gf$ and $1_X$ are homotopic. A similar argument shows that $fg$ and $1_{X'}$ are homotopic.


There is an analogous sequence of results that ends with resolutions that are cochain complexes. They will be just as useful as the above results when we introduce derived functors in the next section. For the results below, the notion of a projective is replaced by that of an injective. We say that a module $I$ in the good category $\mathcal{C}$ is injective in $\mathcal{C}$ or is an injective in $\mathcal{C}$ if whenever a diagram in the category is given as in Figure 4.4 with $\varphi$ mapping one-one from $B$ into $C$, then there exists $\sigma : C \to I$ in $\mathcal{C}$ such that the diagram commutes.

\[
\begin{array}{ccccc}
 & & I & & \\
 & & {\scriptstyle\tau}\uparrow & \nwarrow{\scriptstyle\sigma} & \\
0 & \longrightarrow & B & \xrightarrow{\ \varphi\ } & C
\end{array}
\]

FIGURE 4.4. Defining property of an injective.


We can think of the condition as saying that we can always extend such a $\tau$ from $B$ to $C$, the extension being $\sigma$. In any event, we give some examples after proving an analog of Lemma 4.11.


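Lemma 4.14. If $I$ is injective in the good category $\mathcal{C}$ and if the diagram

\[
\begin{array}{ccccc}
 & & I & & \\
 & & {\scriptstyle\tau}\uparrow & \nwarrow{\scriptstyle\sigma} & \\
A' & \xrightarrow{\ \psi\ } & A & \xrightarrow{\ \varphi\ } & A''
\end{array}
\]

in $\mathcal{C}$ has $\ker\varphi = \operatorname{image}\psi$ and $\tau\psi = 0$, then there exists a map $\sigma : A'' \to I$ in $\mathcal{C}$ such that the diagram commutes.


PROOF. The hypotheses force $\ker\tau \supseteq \operatorname{image}\psi = \ker\varphi$. Thus $\tau : A \to I$ and $\varphi : A \to A''$ descend to maps $\bar\tau : A/\ker\varphi \to I$ and $\bar\varphi : A/\ker\varphi \to A''$. If we put $B = A/\ker\varphi$ and $C = A''$, then the above diagram leads to Figure 4.4 with $\bar\tau$ and $\bar\varphi$ in place of $\tau$ and $\varphi$. The hypothesis “injective” gives us $\sigma$ in Figure 4.4 with $\bar\tau = \sigma\bar\varphi$, and the same $\sigma$ is the required map in the diagram above.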


EXAMPLES OF INJECTIVES. 


(1) If $R$ is a field $F$ and if $\mathcal{C}$ is the category of all vector spaces over $F$, then every module is injective. In fact, in Figure 4.4 we write $C = \operatorname{image}\varphi \oplus B'$, and we let $\eta : \operatorname{image}\varphi \to B$ be the inverse of $\varphi : B \to \operatorname{image}\varphi$. Then we can define $\sigma$ to be 0 on $B'$ and to be $\tau\eta$ on $\operatorname{image}\varphi$.


(2) Let $\mathcal{C}$ be the category of all abelian groups (unital $\mathbb{Z}$ modules). An abelian group $G$ is said to be divisible if for each integer $n \ne 0$ and each $x \in G$, there exists $y \in G$ with $ny = x$. Two examples of divisible abelian groups are the additive group of rationals and the additive group of rationals modulo 1. It is easy to see that any quotient of a divisible group is divisible and that direct sums of divisible groups are divisible. Let us see for abelian groups that injective is equivalent to divisible.

The argument that injective implies divisible is easy: Let $I$ be injective. Given $x \in I$ and $n \ne 0$, let $B = C = \mathbb{Z}$, let $\tau : \mathbb{Z} \to I$ have $\tau(k) = kx$, and let $\varphi : \mathbb{Z} \to \mathbb{Z}$ have $\varphi(k) = kn$. Setting up Figure 4.4, we obtain a $\sigma : \mathbb{Z} \to I$ with $\tau = \sigma\varphi$. If we put $y = \sigma(1)$ and evaluate both sides at 1, then we obtain $x = \tau(1) = \sigma(\varphi(1)) = \sigma(n) = n\sigma(1) = ny$, as required.

The argument that divisible implies injective uses Zorn’s Lemma. Let $I$ be divisible, and suppose that $B$, $C$, $\varphi$, and $\tau$ are given as in Figure 4.4. Consider the set $S$ of abelian-group homomorphisms $\sigma'$ having domain a subgroup of $C$ containing $\varphi(B)$, having range $I$, and having $\sigma'\varphi = \tau$. Order $S$ by inclusion upward of the corresponding sets of ordered pairs. The set $S$ is nonempty because the homomorphism $\sigma'$ with domain $\varphi(B)$ and values $\sigma'(\varphi(b)) = \tau(b)$ lies in $S$; $\sigma'$ is well defined because $\varphi$ is assumed one-one. Zorn’s Lemma yields a maximal element $\sigma$ in $S$, say with domain $C_0$. We show that $C_0 = C$. Arguing by contradiction, suppose that $C_0$ is a proper subgroup. Let $c$ be in $C$ but not $C_0$. The set of integers $k$ with $kc$ in $C_0$ is an ideal in $\mathbb{Z}$, and we let $n$ be a generator. Since $I$ is divisible, there exists an element $a$ in $I$ with $na = \sigma(nc)$. Define $\tilde\sigma$ on the subgroup generated by $c$ and $C_0$ by the formula $\tilde\sigma(kc + c_0) = ka + \sigma(c_0)$ for $k \in \mathbb{Z}$ and $c_0 \in C_0$. We need to check that $\tilde\sigma$ is well defined. If $kc + c_0 = k'c + c_0'$, then $(k - k')c = c_0' - c_0$ is in $C_0$, and thus $k - k' = qn$ for some integer $q$. Hence
\[
\begin{aligned}
\tilde\sigma(kc + c_0) - \tilde\sigma(k'c + c_0') &= (k - k')a + \sigma(c_0 - c_0') = qna + \sigma(c_0 - c_0')\\
 &= q\sigma(nc) + \sigma(c_0 - c_0') = q\sigma(nc) - \sigma((k - k')c) = q\sigma(nc) - q\sigma(nc) = 0 .
\end{aligned}
\]
Therefore $\tilde\sigma$ is a nontrivial additive extension of $\sigma$, in contradiction to maximality of $\sigma$, and the proof is complete.


(3) For R = Z, two good categories that were listed in Section 2 were the 
category of all finitely generated abelian groups and the category of all torsion 
abelian groups. With the first of these, Problem 1 at the end of the chapter asks 
for a verification that some module in the category fails to be a submodule of any 
injective. With the second of these, the injectives are the torsion divisible groups. 
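
As a hedged illustration of Example 2, the following computation shows directly from Figure 4.4 why the abelian group $\mathbb{Z}$ fails to be injective while $\mathbb{Q}$ passes the same test. The specific maps are assumptions chosen for the example and are not taken from the text.

```latex
% Assumed test diagram in Figure 4.4:  B = C = Z,  \varphi(k) = 2k  (one-one),  \tau(k) = k.
% An extension \sigma with \sigma\varphi = \tau would have to satisfy
\[
1 = \tau(1) = \sigma(\varphi(1)) = \sigma(2) = 2\,\sigma(1).
\]
% For I = Z there is no integer \sigma(1) with 2\sigma(1) = 1, so no such \sigma exists
% and Z is not injective (equivalently, Z is not divisible).
% For I = Q, with \tau the inclusion of Z into Q, the choice \sigma(1) = 1/2 works:
\[
\sigma(k) = \tfrac{k}{2},\qquad \sigma(\varphi(k)) = \sigma(2k) = k = \tau(k).
\]
```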


The next proposition extends Example 2 and its proof to general R. Although 
the condition in the proposition is not very intuitive for general R, it has a simple 
interpretation for (commutative) principal ideal domains; see Problem 4 at the 
end of the chapter. 


Proposition 4.15. A unital left R module / is injective for the good category 
of all unital left R modules if and only if every R homomorphism of a left ideal 
J of R into J extends to an R homomorphism R — /. 


PROOF. The necessity is immediate from Figure 4.4 and the definition of
“injective” if we take $B = J$, $C = R$ and write $\tau$ for the given $R$ homomorphism
of $J$ into $I$.

For the sufficiency, suppose that $I$ and a diagram as in Figure 4.4 are given.
Consider the set $S$ of $R$ module homomorphisms $\sigma'$ having domain an $R$ submodule
of $C$ containing $\varphi(B)$ and having range $I$ such that $\sigma'\varphi = \tau$, and
order $S$ by inclusion upward of the corresponding sets of ordered pairs. The
set $S$ is nonempty because the homomorphism $\sigma'$ with domain $\varphi(B)$ and values
$\sigma'(\varphi(b)) = \tau(b)$ lies in $S$; $\sigma'$ is well defined because $\varphi$ is assumed one-one.
Zorn's Lemma yields a maximal element $\sigma$ in $S$, say with domain $\tilde C$. We
show that $\tilde C = C$. Arguing by contradiction, suppose that $\tilde C$ is a proper $R$
submodule of $C$. Let $c$ be in $C$ but not $\tilde C$. The set of elements $r \in R$ with $rc$
in $\tilde C$ is a left ideal $J$ in $R$, and the mapping $\psi(r) = \sigma(rc)$ is a well-defined $R$
homomorphism of $J$ into $I$. By hypothesis, $\psi$ extends to an $R$ homomorphism
$\Psi : R \to I$. Define $\tilde\sigma$ on the $R$ submodule $Rc + \tilde C$ generated by $c$ and $\tilde C$ by the formula
$\tilde\sigma(rc + \tilde c) = \Psi(r) + \sigma(\tilde c)$ for $r \in R$ and $\tilde c \in \tilde C$. We need to check that $\tilde\sigma$ is
well defined. If $rc + \tilde c = r'c + \tilde c'$, then $(r - r')c = \tilde c' - \tilde c$ is in $\tilde C$, and thus
$r - r'$ is in $J$. Consequently $\Psi(r) - \Psi(r') = \Psi(r - r') = \sigma((r - r')c)$. Hence
\[
\tilde\sigma(rc + \tilde c) - \tilde\sigma(r'c + \tilde c')
 = (\Psi(r) - \Psi(r')) + \sigma(\tilde c - \tilde c')
 = \sigma((r - r')c) + \sigma(\tilde c - \tilde c')
 = \sigma((r - r')c) - \sigma((r - r')c) = 0.
\]
Therefore $\tilde\sigma$ is a nontrivial extension of $\sigma$, in
contradiction to maximality of $\sigma$, and the proof is complete.
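As an illustration of Proposition 4.15, the following sketch specializes the criterion to $R = \mathbb{Z}$, where every ideal has the form $n\mathbb{Z}$. The symbols $x$ and $y$ are introduced only for this illustration; the outcome agrees with the description of injectives for $\mathbb{Z}$ as the divisible groups in Example 2.

```latex
% Sketch: Proposition 4.15 specialized to R = \mathbb{Z} (illustration only).
% Every ideal of \mathbb{Z} is J = n\mathbb{Z}. For n \neq 0, a homomorphism
% \psi : n\mathbb{Z} \to I is determined by x = \psi(n), and x may be any element of I.
\[
\psi : n\mathbb{Z} \to I, \qquad \psi(kn) = kx \quad (k \in \mathbb{Z}).
\]
% An extension \Psi : \mathbb{Z} \to I is determined by y = \Psi(1) and must satisfy
\[
x = \psi(n) = \Psi(n) = n\,\Psi(1) = ny .
\]
% Hence every such \psi extends, for all n \neq 0 and all x \in I, exactly when
\[
nI = I \quad \text{for every integer } n \neq 0,
\]
% that is, exactly when I is divisible. For n = 0 the ideal is 0 and nothing is required.
```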


Now we can prove an analog of Theorem 4.12 for cochain complexes. This
result had no counterpart in Chapter III.


Theorem 4.16. Let $X = \{(X_n, d_n)\}_{n=-\infty}^{\infty}$ and $X' = \{(X'_n, d'_n)\}_{n=-\infty}^{\infty}$ be
cochain complexes in the good category $\mathcal{C}$, and let $r$ be an integer. Let
$\{f_n : X_n \to X'_n\}_{n \le r}$ be a family of maps in $\mathcal{C}$ such that $d'_{n-1} f_{n-1} = f_n d_{n-1}$
for $n \le r$. If $X$ is exact at each $X_n$ with $n \ge r$ and $X'_n$ is injective for $n > r$, then
$\{f_n : X_n \to X'_n\}_{n \le r}$ extends to a cochain map $f : X \to X'$, and $f$ is unique up
to homotopy. More precisely any two extensions are homotopic by a homotopy
$h$ such that $h_n = 0$ for $n \le r$.


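REMARKS. The diagrams in question are
\[
\begin{array}{ccccccc}
\cdots \xrightarrow{\ d_{n-2}\ } & X_{n-1} & \xrightarrow{\ d_{n-1}\ } & X_n & \xrightarrow{\ d_n\ } & X_{n+1} & \xrightarrow{\ d_{n+1}\ } \cdots \\
 & \downarrow {\scriptstyle f_{n-1}} & & \downarrow {\scriptstyle f_n} & & \downarrow {\scriptstyle f_{n+1}} & \\
\cdots \xrightarrow{\ d'_{n-2}\ } & X'_{n-1} & \xrightarrow{\ d'_{n-1}\ } & X'_n & \xrightarrow{\ d'_n\ } & X'_{n+1} & \xrightarrow{\ d'_{n+1}\ } \cdots
\end{array}
\]
for the construction of the cochain map and
\[
\begin{array}{ccccccc}
\cdots \longrightarrow & X_n & \xrightarrow{\ d_n\ } & X_{n+1} & \xrightarrow{\ d_{n+1}\ } & X_{n+2} & \longrightarrow \cdots \\
 & \downarrow {\scriptstyle f_n - g_n} & & \downarrow {\scriptstyle f_{n+1} - g_{n+1}} & & \downarrow {\scriptstyle f_{n+2} - g_{n+2}} & \\
\cdots \longrightarrow & X'_n & \xrightarrow{\ d'_n\ } & X'_{n+1} & \xrightarrow{\ d'_{n+1}\ } & X'_{n+2} & \longrightarrow \cdots
\end{array}
\]
together with the diagonal maps $h_{n+1} : X_{n+1} \to X'_n$ and $h_{n+2} : X_{n+2} \to X'_{n+1}$,
for the construction of the homotopy.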


PROOF. For the existence of the cochain map, it is enough by induction to
construct $f_{r+1}$. Matters are therefore as in the first of the above diagrams with
$n = r$. Since $X$ is exact at $X_r$ and $X'_{r+1}$ is injective, we are in the situation of
Lemma 4.14 with $I = X'_{r+1}$, $A'' = X_{r+1}$, $A = X_r$, $A' = X_{r-1}$, $\psi = d_{r-1}$,
$\varphi = d_r$, and $\tau = d'_r f_r$. The lemma gives a map $\sigma : A'' \to I$ with $\sigma\varphi = \tau$.
If we take $f_{r+1} = \sigma$, then $\sigma\varphi = \tau$ says that $f_{r+1} d_r = d'_r f_r$, and the inductive
construction of the cochain map is complete.




For the uniqueness up to homotopy, let $f : X \to X'$ and $g : X \to X'$ be
two cochain maps such that $f_n = g_n$ for $n \le r$. Define $h_n : X_n \to X'_{n-1}$ to
be 0 for $n \le r + 1$, and observe that the system of functions $\{h_n\}_{n \le r+1}$ satisfies
$h_{n+1} d_n + d'_{n-1} h_n = f_n - g_n$ for $n \le r$ because $f_n = g_n$ for $n \le r$. Proceeding
inductively, suppose that $s \ge r$ and that $h_n$ has been constructed for $n \le s + 1$ such
that $h_{n+1} d_n + d'_{n-1} h_n = f_n - g_n$ for $n \le s$. We are to construct $h_{s+2} : X_{s+2} \to
X'_{s+1}$. This is the situation of the second diagram with $n = s$. Since $s \ge r$, $X$
is exact at $X_{s+1}$ and $X'_{s+1}$ is injective. Thus we are in the situation of Lemma
4.14 with $I = X'_{s+1}$, $A'' = X_{s+2}$, $A = X_{s+1}$, $A' = X_s$, $\psi = d_s$, $\varphi = d_{s+1}$, and
$\tau = (f_{s+1} - g_{s+1}) - d'_s h_{s+1}$. The lemma gives a map $\sigma : A'' \to I$ with $\sigma\varphi = \tau$.
If we take $h_{s+2} = \sigma$, then $\sigma\varphi = \tau$ says that $h_{s+2} d_{s+1} = (f_{s+1} - g_{s+1}) - d'_s h_{s+1}$,
and the inductive construction of the homotopy is complete.
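The appeal to Lemma 4.14 in the homotopy step presumably requires that $\tau$ vanish on the image of $\psi = d_s$; that reading of the lemma's hypothesis is an assumption here, since the lemma is stated elsewhere. Under that assumption, the following sketch checks the condition using only the cochain-map property of $f$ and $g$, the inductive hypothesis for $n = s$, and $d'_s d'_{s-1} = 0$.

```latex
% Sketch: \tau d_s = 0 for \tau = (f_{s+1} - g_{s+1}) - d'_s h_{s+1}.
\begin{align*}
\tau\, d_s
 &= (f_{s+1} - g_{s+1})\, d_s - d'_s h_{s+1} d_s \\
 &= d'_s (f_s - g_s) - d'_s \bigl( (f_s - g_s) - d'_{s-1} h_s \bigr)
   && \text{cochain maps; hypothesis for } n = s \\
 &= d'_s d'_{s-1} h_s \\
 &= 0 .
\end{align*}
```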


A cochain complex $X = (X^+ \leftarrow M)$ that forms a resolution is called an
injective resolution of $M$ if every $X_n$ for $n \ge 0$ is an injective.


Corollary 4.17. Let $M$ be a module in a good category $\mathcal{C}$ and let
\[
X = (X^+ \xleftarrow{\ \varepsilon\ } M) \qquad \text{and} \qquad X' = (X'^+ \xleftarrow{\ \varepsilon'\ } M)
\]
be two injective resolutions of $M$. Then there exist cochain maps $f : X \to X'$
and $g : X' \to X$ with $f_{-1} = 1_M$ and $g_{-1} = 1_M$, and any two such cochain
maps $f$ and $g$ have the property that $gf : X \to X$ is homotopic to $1_X$ and
$fg : X' \to X'$ is homotopic to $1_{X'}$.


PROOF. The existence of $f$ extending $f_{-1} = 1_M$ is immediate by applying
the first part of Theorem 4.16 with $r = -1$. The hypotheses apply because $X$
is exact at $X_n$ for $n \ge -1$ and $X'_n$ is injective for $n > -1$. A similar argument
shows the existence of $g$.

If we have $f$ and $g$, then $gf : X \to X$ and $1_X : X \to X$ are cochain maps
that extend the partial cochain map given for $n \le -1$ by $1_M$ for $n = -1$ and by 0
for $n \le -2$. Since again $X$ is exact at $X_n$ for $n \ge -1$ and $X_n$ is injective for
$n > -1$, the second part of the theorem shows that $gf$ and $1_X$ are homotopic. A
similar argument shows that $fg$ and $1_{X'}$ are homotopic.


We conclude with elementary characterizations of projectives and injectives
that will turn out to be quite useful in the next two sections. We begin with a
lemma⁶ that will be useful now and will be helpful as motivation in the next
section.

⁶ The lemma is a slight variant of Problem 5 at the end of Chapter X of Basic Algebra.


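Lemma 4.18. Let $\mathcal{C}$ be a good category of unital left $R$ modules, and let
\[
0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0
\]
be an exact sequence in $\mathcal{C}$. Then the following conditions are equivalent:

(a) $B$ is a direct sum $B = B' \oplus \ker\psi$ of modules in $\mathcal{C}$,
(b) there exists an $R$ homomorphism $\sigma : C \to B$ such that $\psi\sigma = 1_C$,
(c) there exists an $R$ homomorphism $\tau : B \to A$ such that $\tau\varphi = 1_A$.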


REMARK. When the equivalent conditions of this lemma are satisfied, one says 
that the exact sequence is split. 


PROOF. If (a) holds, then $\psi|_{B'}$ is one-one from $B'$ onto $C$. Let $\sigma$ be its inverse.
Then $\sigma : C \to B'$ is one-one with $\psi\sigma = 1_C$. So (b) holds.

If (b) holds, then any $b$ in $B$ has the property that $b - \sigma\psi(b)$ has $\psi(b - \sigma\psi(b)) =
\psi(b) - 1_C\psi(b) = 0$ and is therefore in image $\varphi$. Write $b - \sigma\psi(b) = \varphi(a)$ for
some $a$ depending on $b$; $a$ is unique because $\varphi$ is one-one. If $\tau : B \to A$ is defined
by $\tau(b) = a$, then $\tau$ is an $R$ homomorphism by the uniqueness of $a$. Consider
$\tau(\varphi(a))$ for $a$ in $A$. The element $b = \varphi(a)$ has $b - \sigma\psi(b) = \varphi(a) - \sigma\psi\varphi(a) =
\varphi(a) - \sigma(0) = \varphi(a)$, and the definition of $\tau$ therefore says that $\tau(\varphi(a)) = a$.
Hence $\tau\varphi = 1_A$, and (c) holds.

If (c) holds, then $B' = \ker\tau$ is an $R$ submodule of $B$. If $b$ is in $B' \cap \operatorname{image}\varphi$,
then $b = \varphi(a)$ for some $a \in A$ and also $0 = \tau(b) = \tau\varphi(a) = 1_A(a) = a$. So
$b = 0$, and $B' \cap \operatorname{image}\varphi = 0$. If $b \in B$ is given, write $b = (b - \varphi\tau(b)) + \varphi\tau(b)$.
Then $\varphi\tau(b)$ is certainly in image $\varphi$, and $\tau(b - \varphi\tau(b)) = \tau(b) - 1_A\tau(b) = 0$ shows
that $b - \varphi\tau(b)$ is in $B'$. Therefore $B = B' \oplus \operatorname{image}\varphi$. Since $\operatorname{image}\varphi = \ker\psi$,
we see that $B = B' \oplus \ker\psi$ and that (a) holds.
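To make the three conditions of Lemma 4.18 concrete, the following sketch contrasts a short exact sequence of abelian groups that fails to split with one that splits. The particular sequences are chosen only for illustration and are not taken from the text.

```latex
% Sketch: a short exact sequence of abelian groups that does not split.
\[
0 \longrightarrow \mathbb{Z} \xrightarrow{\ \varphi \,=\, 2\cdot\ } \mathbb{Z}
  \xrightarrow{\ \psi\ } \mathbb{Z}/2\mathbb{Z} \longrightarrow 0 .
\]
% Condition (b) fails: any homomorphism \sigma : \mathbb{Z}/2\mathbb{Z} \to \mathbb{Z} satisfies
\[
2\,\sigma(\bar 1) = \sigma(\bar 0) = 0 \ \Longrightarrow\ \sigma = 0 ,
\qquad\text{so}\qquad \psi\sigma = 0 \neq 1_{\mathbb{Z}/2\mathbb{Z}} .
\]
% Condition (c) fails as well: any \tau : \mathbb{Z} \to \mathbb{Z} has
% \tau\varphi(1) = 2\tau(1), which is never equal to 1.
% By contrast, the sequence
\[
0 \longrightarrow \mathbb{Z} \xrightarrow{\ a \,\mapsto\, (a,\bar 0)\ }
  \mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z}
  \xrightarrow{\ (a,\bar b) \,\mapsto\, \bar b\ } \mathbb{Z}/2\mathbb{Z} \longrightarrow 0
\]
% splits, with \sigma(\bar b) = (0,\bar b) and \tau(a,\bar b) = a.
```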


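Proposition 4.19. If $\mathcal{C}$ is a good category of unital left $R$ modules, then

(a) a module $P$ in $\mathcal{C}$ is projective if and only if $\operatorname{Hom}_R(P, \,\cdot\,)$ is an exact functor
from $\mathcal{C}$ into $\mathcal{C}_{\mathbb{Z}}$, if and only if every exact sequence
\[
0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0
\]
in $\mathcal{C}$ splits when its third nonzero member $C$ equals $P$, and
(b) a module $I$ in $\mathcal{C}$ is injective if and only if $\operatorname{Hom}_R(\,\cdot\,, I)$ is an exact functor
from $\mathcal{C}$ into $\mathcal{C}_{\mathbb{Z}}$, if and only if every exact sequence
\[
0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0
\]
in $\mathcal{C}$ splits when its first nonzero member $A$ equals $I$.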


PROOF. For (a), suppose that $P$ is given. The functor $\operatorname{Hom}_R(P, \,\cdot\,)$ is covariant
and left exact, no matter what $P$ is. Proposition 4.3 shows it is exact if and only if
it carries short exact sequences into short exact sequences, and the left exactness
means that the functor is exact if and only if it carries onto maps from $B$ to $C$
to onto maps from $\operatorname{Hom}_R(P, B)$ to $\operatorname{Hom}_R(P, C)$. If $\psi : B \to C$ is given, then
$\operatorname{Hom}(1, \psi) : \operatorname{Hom}_R(P, B) \to \operatorname{Hom}_R(P, C)$ operates on a map $\sigma$ in $\operatorname{Hom}_R(P, B)$
by $\operatorname{Hom}(1, \psi)(\sigma) = \psi\sigma$. The statement that the equation $\psi\sigma = \tau$ is solvable
for $\sigma$ for each $\tau$ in $\operatorname{Hom}_R(P, C)$ whenever $\psi$ is onto is precisely the statement
that Figure 4.3 is solvable for $\sigma$ for all possible $\tau$'s whenever $B \to C \to 0$ is
exact, and thus $P$ is projective if and only if the functor is exact.

If $P$ is projective and an exact sequence with $C = P$ is given, take $\tau = 1_P$
in Figure 4.3. The projective property yields a map $\sigma : P \to B$ with $\psi\sigma = 1_P$,
and Lemma 4.18b shows that the exact sequence splits.

Conversely suppose that every short exact sequence with $P$ as its third nonzero
member splits. Suppose that a diagram as in Figure 4.3 is given with $\psi : C \to B$
onto and with $\tau$ mapping $P$ into $B$. Let $S = C \oplus P$, and let $T$ be the $R$
submodule $\{(c, x) \in C \oplus P \mid \psi(c) = \tau(x)\}$ of $S$. Denote the projections
of $S$ to $C$ and $P$ by $p_C$ and $p_P$, and let $j : T \to S$ be the inclusion. The
map⁷ $p_P j$ carries $T$ onto $P$; in fact, if $x \in P$ is given, then $\psi : C \to B$
onto implies that there exists $c_x \in C$ with $\psi(c_x) = \tau(x)$. Then $(c_x, x)$ lies in
$T$, and $p_P j(c_x, x) = p_P(c_x, x) = x$. Consequently we have a 5-term exact
sequence with terms $0$, $\ker(p_P j)$, $T$, $P$, $0$, and this must split by hypothesis.
Thus there exists a map $q : P \to T$ with $p_P j q = 1_P$. Define $\sigma = p_C j q$.
For $x \in P$, $jq(x)$ is some member of $S$ of the form $(c, x)$ with $\psi(c) = \tau(x)$.
Hence $\psi\sigma(x) = \psi p_C j q(x) = \psi p_C(c, x) = \psi(c) = \tau(x)$. Thus $\psi\sigma = \tau$, and
$\sigma : P \to C$ is the required map that exhibits $P$ as projective.

For (b), suppose that $I$ is given. The functor $\operatorname{Hom}_R(\,\cdot\,, I)$ is contravariant and
left exact, no matter what $I$ is. It is exact if and only if it carries one-one maps
from $A$ to $B$ to onto maps from $\operatorname{Hom}_R(B, I)$ to $\operatorname{Hom}_R(A, I)$. If $\varphi : A \to B$ is
given, then $\operatorname{Hom}(\varphi, 1) : \operatorname{Hom}_R(B, I) \to \operatorname{Hom}_R(A, I)$ operates on a map $\sigma$ in
$\operatorname{Hom}_R(B, I)$ by $\operatorname{Hom}(\varphi, 1)(\sigma) = \sigma\varphi$. The statement that the equation $\sigma\varphi = \tau$
is solvable for $\sigma$ for each $\tau$ in $\operatorname{Hom}_R(A, I)$ whenever $\varphi$ is one-one is precisely
the statement that Figure 4.4 is solvable for $\sigma$ for all possible $\tau$'s whenever
$0 \to A \to B$ is exact, and thus $I$ is injective if and only if the functor is exact.

If $I$ is injective and an exact sequence with $A = I$ is given, take $\tau = 1_I$ in
Figure 4.4. The injective property yields a map $\sigma : B \to I$ with $\sigma\varphi = 1_I$, and
Lemma 4.18c shows that the exact sequence splits.

Conversely suppose that every short exact sequence with $I$ as its first nonzero
member splits. Suppose that a diagram as in Figure 4.4 is given with $\varphi : A \to B$
one-one and with $\tau$ mapping $A$ into $I$. Let $S = B \oplus I$, and let $T$ be the quotient of
$S$ by the $R$ submodule $\{(\varphi(a), -\tau(a)) \mid a \in A\}$. Denote the inclusions of $B$ and $I$
into $S$ by $i_B$ and $i_I$, and let $k : S \to T$ be the quotient mapping. The composition⁸
$k i_I$ is one-one from $I$ into $T$. In fact, if $k i_I(x) = 0$ for some $x \in I$, then $(0, x)$
is a member of $S$ of the form $(\varphi(a), -\tau(a))$ for some $a \in A$; thus $\varphi(a) = 0$, and
the fact that $\varphi$ is one-one implies that $a = 0$ and hence that $x = -\tau(a) = 0$.
Consequently we have a 5-term exact sequence with terms $0$, $I$, $T$, $T/I$, $0$, and
this must split by hypothesis. Thus there exists a map $r : T \to I$ with $r k i_I = 1_I$.
Define $\sigma = r k i_B$. For $a \in A$, $i_B\varphi(a) - i_I\tau(a) = (\varphi(a), -\tau(a))$ is in $\ker k$. Thus
$k i_B \varphi(a) = k i_I \tau(a)$, and $\sigma\varphi(a) = r k i_B \varphi(a) = r k i_I \tau(a) = 1_I \tau(a) = \tau(a)$ for
$a \in A$. Therefore $\sigma\varphi = \tau$, and $\sigma : B \to I$ is the required map that exhibits $I$ as
injective.

⁷ The pair $(p_C j, p_P j)$ is called the pullback of $(\tau, \psi)$. See Problem 35 at the end of the chapter.
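The functor criteria in Proposition 4.19 can be tested on small abelian groups. The following sketch, with examples chosen only for illustration, shows that $\mathbb{Z}/2\mathbb{Z}$ is not projective and that $\mathbb{Z}$ is not injective in the category of abelian groups.

```latex
% (a) \mathbb{Z}/2\mathbb{Z} is not projective:
%     apply Hom_Z(Z/2Z, . ) to the onto map \mathbb{Z} \to \mathbb{Z}/2\mathbb{Z}.
\[
\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/2\mathbb{Z}, \mathbb{Z}) = 0
\ \longrightarrow\
\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/2\mathbb{Z}, \mathbb{Z}/2\mathbb{Z}) \cong \mathbb{Z}/2\mathbb{Z}
\qquad\text{is not onto.}
\]
% (b) \mathbb{Z} is not injective:
%     apply Hom_Z( . , Z) to the one-one map 2\cdot : \mathbb{Z} \to \mathbb{Z}.
\[
\operatorname{Hom}(2\cdot, 1) :
\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}, \mathbb{Z}) \to \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}, \mathbb{Z}),
\qquad \sigma \mapsto \sigma \circ (2\cdot) = 2\sigma ,
\]
% and 1_{\mathbb{Z}} is not of the form 2\sigma, so this map is not onto.
```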


5. Derived Functors 


Now we shall undertake the main construction of the chapter, that of “derived 
functors.” Let C be a good category of unital left R modules. Arranging for 
derived functors to be defined on every module in C requires that each module M 
in C have either a projective resolution or an injective resolution, and thus C must 
have either many projectives or many injectives in a suitable sense. Let us make 
the condition precise. 

We say that $\mathcal{C}$ has enough projectives if every module in $\mathcal{C}$ is a quotient of a
projective in $\mathcal{C}$. Suppose that this condition is satisfied. Let $M$ be a module in $\mathcal{C}$,
and let $X_0$ be a projective that maps onto $M$, say by a map $\varepsilon$. Then $\ker\varepsilon$ is in $\mathcal{C}$,
since good categories are closed under the passage to submodules, and we let $X_1$
be a projective in $\mathcal{C}$ that maps onto $\ker\varepsilon$, say by a map $\partial_0$. Similarly let $X_2$ be a
projective that maps onto $\ker\partial_0$ in $X_1$, say by a map $\partial_1$, and so on. The result is
that we obtain a projective resolution of the form $X^+ \xrightarrow{\ \varepsilon\ } M$ with $X^+$ given by
\[
\cdots \xrightarrow{\ \partial_2\ } X_2 \xrightarrow{\ \partial_1\ } X_1 \xrightarrow{\ \partial_0\ } X_0 .
\]
Consequently the condition “enough projectives” implies that every module in $\mathcal{C}$
has a projective resolution in $\mathcal{C}$.
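The step-by-step construction above can be carried out by hand in a small case. The following sketch takes the ring $\mathbb{Z}/4\mathbb{Z}$, the category of all its unital modules, and the module $M = \mathbb{Z}/2\mathbb{Z}$; these choices, and the choice of a free module of rank one at every stage, are made only for illustration.

```latex
% Sketch: a projective resolution of M = \mathbb{Z}/2\mathbb{Z} over R = \mathbb{Z}/4\mathbb{Z}.
% Step 0: X_0 = R maps onto M by reduction modulo 2, with kernel \{0, 2\}.
\[
\varepsilon : \mathbb{Z}/4\mathbb{Z} \to \mathbb{Z}/2\mathbb{Z}, \qquad
\ker\varepsilon = \{0, 2\} = 2(\mathbb{Z}/4\mathbb{Z}) .
\]
% Step 1: X_1 = R maps onto \ker\varepsilon by multiplication by 2, again with kernel \{0, 2\}.
\[
\partial_0 = 2\cdot : \mathbb{Z}/4\mathbb{Z} \to \mathbb{Z}/4\mathbb{Z}, \qquad
\operatorname{image}\partial_0 = \ker\varepsilon, \qquad \ker\partial_0 = \{0, 2\} .
\]
% Every later step repeats Step 1, so these choices give the resolution
\[
\cdots \xrightarrow{\ 2\cdot\ } \mathbb{Z}/4\mathbb{Z} \xrightarrow{\ 2\cdot\ } \mathbb{Z}/4\mathbb{Z}
 \xrightarrow{\ 2\cdot\ } \mathbb{Z}/4\mathbb{Z} \xrightarrow{\ \varepsilon\ } \mathbb{Z}/2\mathbb{Z} \longrightarrow 0 ,
\]
% in which each X_n = \mathbb{Z}/4\mathbb{Z} is free of rank one over R, hence projective.
```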

Similarly we say that $\mathcal{C}$ has enough injectives if every module in $\mathcal{C}$ is a
submodule of an injective in $\mathcal{C}$. Suppose that this condition is satisfied. Let
$M$ be a module in $\mathcal{C}$, and let $X_0$ be an injective into which $M$ embeds, say by
a map $\varepsilon$. Then $X_0/\operatorname{image}\varepsilon$ is in $\mathcal{C}$, since good categories are closed under the
passage to quotient modules, and we let $X_1$ be an injective into which $X_0/\operatorname{image}\varepsilon$
embeds, say by a map $\bar d_0$. Let $d_0$ be the composition of the quotient map from $X_0$
to $X_0/\operatorname{image}\varepsilon$, followed by $\bar d_0$; then $d_0$ maps $X_0$ into $X_1$ with $\ker d_0 = \operatorname{image}\varepsilon$.
We let $X_2$ be an injective into which $X_1/\operatorname{image} d_0$ embeds, say by $\bar d_1$, and we let
$d_1$ be the composition of the quotient map from $X_1$ to $X_1/\operatorname{image} d_0$, followed by
$\bar d_1$; then $d_1$ maps $X_1$ into $X_2$ with $\ker d_1 = \operatorname{image} d_0$. Continuing in this way, we
obtain an injective resolution of the form $X^+ \xleftarrow{\ \varepsilon\ } M$ with $X^+$ given by
\[
\cdots \xleftarrow{\ d_2\ } X_2 \xleftarrow{\ d_1\ } X_1 \xleftarrow{\ d_0\ } X_0 .
\]

⁸ The pair $(k i_B, k i_I)$ is called the pushout of $(\tau, \varphi)$. See Problem 35 at the end of the chapter.


Consequently the condition “enough injectives” implies that every module in C 
has an injective resolution in C. 

The category $\mathcal{C}_R$ of all unital left $R$ modules certainly has enough projectives.
In fact, every module in $\mathcal{C}_R$ is the quotient of a free $R$ module, and free $R$ modules
are projective in $\mathcal{C}_R$. It is less trivial but still true that $\mathcal{C}_R$ has enough injectives.
Let us pause for a moment to prove this result in Proposition 4.20 below.

As is shown in Problems 1-2 at the end of the chapter, other good categories 
of unital left R modules may or may not have enough projectives or enough 
injectives, and a good category may have the one without the other. 


Proposition 4.20. If R is any ring with identity, then the category of all unital 
left R modules has enough injectives. 


PROOF. We treat first the case that $R = \mathbb{Z}$. In view of Example 2 of injectives,
we are to exhibit an arbitrary abelian group $A$ as isomorphic to a subgroup of a
divisible group. We know that $A$ is isomorphic to a quotient of some free abelian
group. Write $A \cong F/S$ with $F$ a direct sum of copies of $\mathbb{Z}$ and $S$ equal to some
subgroup of $F$. Taking a $\mathbb{Z}$ basis for $F$ and forming a $\mathbb{Q}$ vector space with that
same basis, we can regard $F$ as a subgroup of the additive group $D$ of a rational
vector space. The group $D$ is divisible, and $A$ is isomorphic to a subgroup of
$D/S$. Any quotient of a divisible group is divisible, and thus $D/S$ is divisible.

Now we allow $R$ to be any ring with identity. We shall make use of various
results from Chapter X of Basic Algebra. If $M$ is any unital left $R$ module, let us
denote by $\mathcal{F}M$ the underlying abelian group⁹ of $M$. If we regard $R$ as a $(\mathbb{Z}, R)$
bimodule, then Proposition 10.17 makes $\operatorname{Hom}_{\mathbb{Z}}(R, \mathcal{F}M)$ into a left $R$ module,
with $(r\varphi)(r') = \varphi(r'r)$ for $r$ and $r'$ in $R$. The mapping $m \mapsto \varphi_m$ with $\varphi_m(r) = rm$
is a one-one $R$ homomorphism of $M$ into $\operatorname{Hom}_{\mathbb{Z}}(R, \mathcal{F}M)$. From the previous
paragraph we can find a divisible abelian group $D$ with $\mathcal{F}M \subseteq D$, and we can then
regard the left $R$ module $\operatorname{Hom}_{\mathbb{Z}}(R, \mathcal{F}M)$ as an $R$ submodule of $\operatorname{Hom}_{\mathbb{Z}}(R, D)$.
Consequently we can regard $M$ as an $R$ submodule of $\operatorname{Hom}_{\mathbb{Z}}(R, D)$. We are
going to prove that $I = \operatorname{Hom}_{\mathbb{Z}}(R, D)$ is injective in $\mathcal{C}_R$.

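We digress for a moment to make a side calculation. With $D$ fixed and $N$ equal
to any unital left $R$ module, we make use of the isomorphism
\[
\operatorname{Hom}_R(N, \operatorname{Hom}_{\mathbb{Z}}(R, D)) \cong \operatorname{Hom}_{\mathbb{Z}}(R \otimes_R N, D)
\]
given in Proposition 10.23 of Basic Algebra; in the expression $R \otimes_R N$, the left
factor of $R$ is to be regarded as a right $R$ module (and not also a left $R$ module),
and then $R \otimes_R N$ is really $\mathcal{F}(R \otimes_R N)$ in the sense that the tensor product retains
only the structure of an abelian group. Meanwhile, Corollary 10.19a gives us
\[
\operatorname{Hom}_{\mathbb{Z}}(R \otimes_R N, D) \cong \operatorname{Hom}_{\mathbb{Z}}(N, D);
\]
here the $R$ on the left is an $(R, R)$ bimodule, and the isomorphism is one of left
$R$ modules. However, there is no harm in applying $\mathcal{F}$ to both sides and obtaining
\[
\operatorname{Hom}_{\mathbb{Z}}(\mathcal{F}(R \otimes_R N), D) \cong \operatorname{Hom}_{\mathbb{Z}}(\mathcal{F}N, D).
\]
Thus
\[
\operatorname{Hom}_R(N, \operatorname{Hom}_{\mathbb{Z}}(R, D)) \cong \operatorname{Hom}_{\mathbb{Z}}(\mathcal{F}N, D). \tag{$*$}
\]
If we track down the isomorphisms in the results of Chapter X, we see that
the map from left to right sends $\varphi \in \operatorname{Hom}_R(N, \operatorname{Hom}_{\mathbb{Z}}(R, D))$ to the map
$\Phi \in \operatorname{Hom}_{\mathbb{Z}}(\mathcal{F}N, D)$ with $\Phi(x) = \varphi(x)(1)$ for $x \in N$, and the inverse sends
$\Phi$ to $\varphi$ with $\varphi(x)(r) = \Phi(rx)$.

⁹ $\mathcal{F}$ is called the forgetful functor from $\mathcal{C}_R$ to $\mathcal{C}_{\mathbb{Z}}$.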

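Now we return to $I = \operatorname{Hom}_{\mathbb{Z}}(R, D)$. By Proposition 4.19b, $I$ will be injective
if and only if $\operatorname{Hom}_R(\,\cdot\,, I)$ is an exact functor. Since this functor is contravariant
and left exact, it is enough to prove that if $0 \to A \xrightarrow{\ \psi\ } B$ is exact in $\mathcal{C}_R$, then
\[
\operatorname{Hom}_R(B, I) \xrightarrow{\ \operatorname{Hom}(\psi, 1)\ } \operatorname{Hom}_R(A, I) \longrightarrow 0 \tag{$**$}
\]
is exact in $\mathcal{C}_{\mathbb{Z}}$. Let us reinterpret $(**)$ in the light of the isomorphism $(*)$ when
$N = B$ and $N = A$. If $\varphi$ is in $\operatorname{Hom}_R(B, \operatorname{Hom}_{\mathbb{Z}}(R, D))$, then $\operatorname{Hom}(\psi, 1)(\varphi)$
is the member $\varphi\psi$ of $\operatorname{Hom}_R(A, \operatorname{Hom}_{\mathbb{Z}}(R, D))$. The corresponding members of
$\operatorname{Hom}_{\mathbb{Z}}(\mathcal{F}B, D)$ and $\operatorname{Hom}_{\mathbb{Z}}(\mathcal{F}A, D)$ are $\Phi$ with $\Phi(b) = \varphi(b)(1)$ and a member
$\Phi'$ of $\operatorname{Hom}_{\mathbb{Z}}(\mathcal{F}A, D)$ with $\Phi'(a) = \varphi\psi(a)(1)$. Thus $\Phi' = \Phi(\mathcal{F}\psi)$, and the mapping
$\operatorname{Hom}(\psi, 1)$ in $(**)$ translates under the isomorphisms $(*)$ into the mapping
$\operatorname{Hom}(\mathcal{F}\psi, 1)$ of $\operatorname{Hom}_{\mathbb{Z}}(\mathcal{F}B, D)$ into $\operatorname{Hom}_{\mathbb{Z}}(\mathcal{F}A, D)$. The group $D$ is divisible,
hence injective in $\mathcal{C}_{\mathbb{Z}}$. Since $\mathcal{F}\psi : \mathcal{F}A \to \mathcal{F}B$ is one-one and $D$ is injective
in $\mathcal{C}_{\mathbb{Z}}$, Proposition 4.19b shows that $\operatorname{Hom}(\mathcal{F}\psi, 1)$ carries $\operatorname{Hom}_{\mathbb{Z}}(\mathcal{F}B, D)$ onto
$\operatorname{Hom}_{\mathbb{Z}}(\mathcal{F}A, D)$. Therefore $(**)$ is exact, and we conclude that $I$ is injective
in $\mathcal{C}_R$.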
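Two formulas in this proof are stated without verification: that $m \mapsto \varphi_m$ respects the $R$ action, and that the inverse of the isomorphism $(*)$ is $\varphi(x)(r) = \Phi(rx)$. The following sketch checks both, using only the action $(r\varphi)(r') = \varphi(r'r)$ given in the proof.

```latex
% The map m \mapsto \varphi_m, with \varphi_m(r) = rm, respects the R action:
\[
(r\varphi_m)(r') = \varphi_m(r'r) = (r'r)m = r'(rm) = \varphi_{rm}(r') ,
\qquad\text{so}\qquad r\varphi_m = \varphi_{rm} .
\]
% It is one-one because \varphi_m(1) = m recovers m.
% For (*): if \Phi(x) = \varphi(x)(1), then R linearity of \varphi gives
\[
\Phi(rx) = \varphi(rx)(1) = (r\varphi(x))(1) = \varphi(x)(1 \cdot r) = \varphi(x)(r) ,
\]
% which is the stated formula for recovering \varphi from \Phi.
```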


Derived functors of an additive functor $F$ from one good category to another
will be useful when $F$ is left exact or right exact, and there will be one derived
functor for each integer $n \ge 0$. The value of the $n^{\text{th}}$ derived functor on a module
$M$ is obtained by taking a projective or injective resolution of $M$ according to
the rule in Figure 4.5, applying $F$ to the resolution, dropping the term $F(M)$
that occurs in degree $-1$, and forming the $n^{\text{th}}$ homology or cohomology of the
resulting complex. The full traditional notation for the derived functor in question
appears in Figure 4.5, along with an abbreviated notation that we shall tend to
use.

The choice of projective or injective resolution at the start is made in such a
way that the $0^{\text{th}}$ derived functor is naturally isomorphic to $F$; this condition will
be clarified in Proposition 4.21 below. If a projective resolution is to be used,
one makes the assumption that the domain category has enough projectives; if
an injective resolution is to be used, one makes the assumption that the domain
category has enough injectives.

If the resulting complex obtained by applying $F$ to the resolution is a chain
complex, the abbreviated notation is $F_n$ for the $n^{\text{th}}$ derived functor; otherwise it
is $F^n$. The full traditional notation involves using an $L$ or $R$ in front of $F$ to
denote the one-sided exactness, left or right, that $F$ is not assumed to have, and
the subscript or superscript $n$ is moved from $F$ to the $L$ or $R$.


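Exactness   -variant   Resolution   -ology    Notation         Example
right       co-        projective   hom-      $F_n$, $L_nF$    $M \otimes_R (\,\cdot\,)$
right       contra-    injective    hom-      $F_n$, $L_nF$    $M \otimes_{\mathbb{Z}} \operatorname{Hom}_{\mathbb{Z}}(\,\cdot\,, I)$, $I$ injective
left        co-        injective    cohom-    $F^n$, $R^nF$    $\operatorname{Hom}_R(M, \,\cdot\,)$
left        contra-    projective   cohom-    $F^n$, $R^nF$    $\operatorname{Hom}_R(\,\cdot\,, M)$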


FIGURE 4.5. Formation of derived functors. 


There are several things that need elaboration in this definition, and we take 
them up right away. 

First there is the fact that $F_n(M)$ or $F^n(M)$ is well defined. Suppose that we
start with two resolutions $X$ and $X'$ of $M$ (projective or injective by the rules in
Figure 4.5). Corollary 4.13 or 4.17 gives us chain or cochain maps $f : X \to X'$
and $g : X' \to X$ with $f_{-1} = 1_M$ and $g_{-1} = 1_M$ and shows that $gf : X \to X$ is
homotopic to $1_X$ and that $fg : X' \to X'$ is homotopic to $1_{X'}$. For definiteness
let us suppose that $F$ is covariant and right exact; then chain maps are involved
and the derived functors of $F$ are to be denoted by $F_n$. Applying $F$ to our chain
maps, we obtain chain maps $F(f) : F(X) \to F(X')$, $F(g) : F(X') \to F(X)$,
$F(gf) : F(X) \to F(X)$, and $F(fg) : F(X') \to F(X')$. The last two of these
are homotopic to $1_{F(X)} : F(X) \to F(X)$ and to $1_{F(X')} : F(X') \to F(X')$,
respectively, by $F$ of the respective homotopies. Proposition 4.1 shows that
$F(g)F(f) = F(gf)$ induces the identity on $H_n(F(X))$ and that $F(f)F(g) =
F(fg)$ induces the identity on $H_n(F(X'))$. Consequently the mappings induced
on homology by $F(f)$ and $F(g)$ are two-sided inverses of one another. Thus
$F_n(M)$ as computed from $X$ is isomorphic to $F_n(M)$ as computed from $X'$.

Moreover, this isomorphism is canonical. If $f' : X \to X'$ is another chain
map with $f'_{-1} = 1_M$, then the same calculation shows that $F(f')$ and $F(g)$ induce two-sided
inverses of each other on homology, and hence $F(f) = F(f')$ on homology.
Thus $F_n(M)$ is well defined up to canonical isomorphism when $F$ is covariant
and right exact. The other three situations in Figure 4.5 are handled in similar
fashion and lead to analogous conclusions.

Next we make $F_n$ or $F^n$ into a functor. To do so, let $\varphi : M \to M'$ be given. For
definiteness, again let us suppose that $F$ is covariant and right exact. Let $X$ and $X'$
be projective resolutions of $M$ and $M'$, respectively, and apply Theorem 4.12 to
produce a chain map $\Phi : X \to X'$ with $\Phi_{-1} = \varphi$. Then $F(\Phi) : F(X) \to F(X')$
is a chain map and induces maps on homology that we denote by $F_n(\varphi)$. Here
$F_n(\varphi)$ maps $F_n(M)$ into $F_n(M')$.

Let us see that $F_n(\varphi)$ is well defined. If $X$ is replaced by $\tilde X$, Corollary 4.13
produces chain maps $f : \tilde X \to X$ and $g : X \to \tilde X$ with $f_{-1} = 1_M$ and $g_{-1} = 1_M$,
and Theorem 4.12 produces a chain map $\tilde\Phi : \tilde X \to X'$ with $\tilde\Phi_{-1} = \varphi$. Since $\Phi \circ f$
and $\tilde\Phi$ are both chain maps from $\tilde X$ to $X'$ that equal $\varphi$ in degree $-1$, Theorem
4.12 shows that $\Phi \circ f$ is homotopic to $\tilde\Phi$. Similarly $\tilde\Phi \circ g$ and $\Phi$ are chain
maps from $X$ to $X'$ and are homotopic. By Proposition 4.1, $F(\Phi \circ f) = F(\tilde\Phi)$
on homology, and $F(\tilde\Phi \circ g) = F(\Phi)$ on homology. Thus on homology $F(\tilde\Phi)$
corresponds to $F(\Phi)$ under the canonical isomorphism $F(f)$, whose inverse on
homology is $F(g)$. In short, $F_n(\varphi)$ is well defined up to the previously obtained
canonical isomorphisms. The other three situations in Figure 4.5 are handled in
similar fashion and lead to analogous conclusions.

Tracing through the definition of how derived functors affect maps, we see 
that the map 1 goes to the map 1 and that compositions go to compositions, in
the same order as for F. Thus the derived functors are indeed functors. The 
derived functors of a covariant functor are covariant, and the derived functors of 
a contravariant functor are contravariant. 

We need to check that the derived functors are additive. If $\varphi : M \to M'$ and
$\varphi' : M \to M'$ are given, then we can proceed as above and use a single resolution
of $M$ and a single resolution of $M'$ to investigate $\varphi$, $\varphi'$, and $\varphi + \varphi'$. Then it
is apparent that the chain or cochain maps built from maps of $M$ to $M'$ add in
the same way as the maps, and the result is that each $F_n$ or $F^n$ is additive with
particular choices of the resolutions in place. Allowing the resolutions to vary
means that we have to take canonical isomorphisms into account, and after doing
so, we still get additivity.

If two functors $F$ and $G$ from $\mathcal{C}$ to $\mathcal{C}'$ of the same type in Figure 4.5 are naturally
isomorphic, then $F_n$ and $G_n$ (or else $F^n$ and $G^n$) are naturally isomorphic for all
$n$. In fact, if $T$ is the natural isomorphism, then $T$ associates a member $T_A$
of $\operatorname{Hom}(F(A), G(A))$ to each module $A$ in $\mathcal{C}$. Take a projective or injective
resolution $X = \{X_n\}$ of $A$, as appropriate, and form the two complexes $F(X)$ and
$G(X)$. The system $\{T_{X_n}\}$ is then a chain map from $F(X)$ to $G(X)$, with inverse
$\{T_{X_n}^{-1}\}$, and the homology or cohomology of $F(X)$ is exhibited as isomorphic to
the homology or cohomology of $G(X)$. This much shows that $F_n(A) \cong G_n(A)$
(or $F^n(A) \cong G^n(A)$) for all $n$. We omit the details of verifying the naturality of
this isomorphism in the $A$ variable for each $n$.


Proposition 4.21. In the four situations of derived functors in Figure 4.5, under
the assumption that the domain category for $F$ has enough projectives or enough
injectives as appropriate, the $0^{\text{th}}$ derived functor of $F$ is naturally isomorphic to $F$.


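PROOF IF F IS COVARIANT AND RIGHT EXACT. Let
\[
X_1 \xrightarrow{\ \partial_0\ } X_0 \xrightarrow{\ \varepsilon\ } M \longrightarrow 0
\]
be the terms in degree 1, 0, −1, −2 of a projective resolution of $M$. By Proposition
4.5 and its remark, the right exactness and covariance of $F$ imply that
\[
F(X_1) \xrightarrow{\ F(\partial_0)\ } F(X_0) \xrightarrow{\ F(\varepsilon)\ } F(M) \longrightarrow 0
\]
is exact. The derived-functor module $F_0(M)$ is computed as the $0^{\text{th}}$ homology of
\[
F(X_1) \xrightarrow{\ F(\partial_0)\ } F(X_0) \longrightarrow 0 .
\]
Thus
\[
F_0(M) = F(X_0)/\operatorname{image} F(\partial_0) = F(X_0)/\ker F(\varepsilon).
\]
Since $F(\varepsilon)$ is onto $F(M)$, the right side here is $\cong F(M)$ via $F(\varepsilon)$.
This establishes the isomorphism. Let us prove that it is natural in the variable
$M$. If $\varphi : M \to M'$ is given, we are to prove that the diagram
\[
\begin{array}{ccc}
F_0(M) & \xrightarrow{\ \text{via } F(\varepsilon)\ } & F(M) \\
\downarrow {\scriptstyle F_0(\varphi)} & & \downarrow {\scriptstyle F(\varphi)} \\
F_0(M') & \xrightarrow{\ \text{via } F(\varepsilon')\ } & F(M')
\end{array}
\tag{$*$}
\]
commutes. Using Theorem 4.12, we form the part of a chain map that is indicated:
\[
\begin{array}{ccccc}
X_0 & \xrightarrow{\ \varepsilon\ } & M & \longrightarrow & 0 \\
\downarrow {\scriptstyle \Phi_0} & & \downarrow {\scriptstyle \varphi} & & \\
X'_0 & \xrightarrow{\ \varepsilon'\ } & M' & \longrightarrow & 0
\end{array}
\]
Application of $F$ gives a commutative diagram
\[
\begin{array}{ccc}
F(X_0) & \xrightarrow{\ F(\varepsilon)\ } & F(M) \\
\downarrow {\scriptstyle F(\Phi_0)} & & \downarrow {\scriptstyle F(\varphi)} \\
F(X'_0) & \xrightarrow{\ F(\varepsilon')\ } & F(M')
\end{array}
\]
and this becomes $(*)$ upon passage to the quotients $F(X_0)/\ker F(\varepsilon)$ and
$F(X'_0)/\ker F(\varepsilon')$. This completes the proof.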


EXAMPLES. 


(1) The invariants functor $F(M) = M^G$ for a group $G$. Suppose that a group
$G$ acts on an abelian group $M$ by automorphisms. This situation is completely
equivalent to considering $M$ as a unital left $\mathbb{Z}G$ module, where $\mathbb{Z}G$ is the integer
group ring of $G$. The subgroup of invariants of $M$ is
\[
M^G = \{ m \in M \mid gm = m \text{ for all } g \in G \}.
\]
The formulas $F(M) = M^G$ for such a module $M$ and $F(h) = h|_{M^G}$ for $h$ in
$\operatorname{Hom}_{\mathbb{Z}G}(M, M')$ define a covariant additive functor called the invariants functor;
we can think of $F$ as carrying $\mathcal{C}_{\mathbb{Z}G}$ into itself, but it is preferable to think of it as
carrying $\mathcal{C}_{\mathbb{Z}G}$ into the category $\mathcal{C}_{\mathbb{Z}}$ of abelian groups. The functor $F$ is naturally
isomorphic to the functor $H = \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}, \,\cdot\,)$, where $\mathbb{Z}$ is made into a $\mathbb{Z}G$
module with trivial $G$ action; as with $F$, we consider $H$ as a functor from $\mathcal{C}_{\mathbb{Z}G}$
to $\mathcal{C}_{\mathbb{Z}}$. To see the isomorphism, we associate to each module $M$ the abelian-group
homomorphism $T_M : M^G \to \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}, M)$ defined by $T_M(m) = \varphi_m$ with
$\varphi_m(k) = km$ for all $k \in \mathbb{Z}$. If $h$ is in $\operatorname{Hom}_{\mathbb{Z}G}(M, M')$, then the two maps $T_{M'} \circ F(h)$
and $H(h) \circ T_M$ of $F(M)$ into $H(M')$ are equal, since at each $m \in M^G$ we have
\[
H(h)T_M(m) = H(h)(\varphi_m) = \operatorname{Hom}(1, h)(\varphi_m) = h\varphi_m = \varphi_{h(m)} = T_{M'}F(h)(m).
\]
This identity means that $\{T_M\}$ is a natural transformation; we readily check for
each $M$ that $T_M$ carries $M^G$ one-one onto $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}, M)$, and thus $\{T_M\}$ is a natural
isomorphism.

Because of this natural isomorphism, the invariants functor is covariant and left
exact. Its derived functors $F^n$ or $H^n$ are obtained by using an injective resolution
$I \leftarrow M \leftarrow 0$, applying the functor $(\,\cdot\,)^G$ or $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}, \,\cdot\,)$, dropping the term in
degree $-1$, and forming cohomology. Briefly
\[
F^n(M) = H^n(I^G) = H^n(\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}, I))
\]
for an injective resolution $I \leftarrow M \leftarrow 0$.

It turns out that the result is given also by the cohomology-of-groups functors
$H^n(G, M)$ even though this was not the procedure by which we obtained group
cohomology in Section III.5. In fact, what Section III.5 said to do was to start
from a free resolution (a projective resolution would have been good enough)
such as $P \to \mathbb{Z} \to 0$ of $\mathbb{Z}$ in $\mathcal{C}_{\mathbb{Z}G}$, apply the contravariant left exact functor
$\operatorname{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$, drop the term in degree $-1$, and form cohomology. Briefly then,
Section III.5 said that
\[
H^n(G, M) = H^n(\operatorname{Hom}_{\mathbb{Z}G}(P, M)) \quad \text{for a projective resolution } P \to \mathbb{Z} \to 0.
\]
The fact that $H^n(G, M)$ can be computed in either of these ways is not particularly
obvious from what we have done so far, but it will be a special case of the natural
isomorphism of functors $\operatorname{Ext}^n$ and $\operatorname{ext}^n$ that is proved as Theorem 4.31 in Section 7.
With either formula for $H^n(G, M)$, we obtain $H^0(G, M) = M^G$ in agreement
with Proposition 4.21.


(2) The co-invariants functor $F(M) = M_G$ for a group $G$. In the same setting
as in Example 1, the group of co-invariants of $M$ is
\[
M_G = M / (\text{subgroup generated by all } gm - m \text{ for } g \in G,\ m \in M).
\]
The functor $F$ can be seen to be naturally isomorphic to the functor $H$ with
$H(M) = \mathbb{Z} \otimes_{\mathbb{Z}G} M$. It is therefore covariant and right exact. Its derived functors
are given by
\[
F_n(M) = H_n(P_G) = H_n(\mathbb{Z} \otimes_{\mathbb{Z}G} P) \quad \text{for a projective resolution } P \to M \to 0.
\]
These are by definition the homology-of-groups functors $H_n(G, M)$. Although
the equality is not particularly obvious, $H_n(G, M)$ can be computed also from
\[
H_n(G, M) = H_n(P \otimes_{\mathbb{Z}G} M) \quad \text{for a projective resolution } P \to \mathbb{Z} \to 0.
\]
This isomorphism is a special case of the natural isomorphism of functors $\operatorname{Tor}_n$
and $\operatorname{tor}_n$ that is mentioned just before Proposition 4.29 in Section 7; the proof
is completely analogous to the proof of Theorem 4.31. With either formula for
$H_n(G, M)$, we obtain $H_0(G, M) = M_G$ in agreement with Proposition 4.21.
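The degree-0 functors in Examples 1 and 2 can be computed directly in a small case. The following sketch takes $G = \{1, g\}$ of order 2 acting on $M = \mathbb{Z}$ by $gm = -m$; this action is chosen only for illustration and does not appear in the text.

```latex
% Sketch: invariants and co-invariants for G = \{1, g\} acting on M = \mathbb{Z} by gm = -m.
\[
M^G = \{ m \in \mathbb{Z} \mid -m = m \} = 0 ,
\qquad\text{so}\qquad H^0(G, M) = 0 .
\]
% The elements gm - m = -2m generate the subgroup 2\mathbb{Z}, hence
\[
M_G = \mathbb{Z} / 2\mathbb{Z} ,
\qquad\text{so}\qquad H_0(G, M) = \mathbb{Z}/2\mathbb{Z} .
\]
% In particular invariants and co-invariants need not be isomorphic.
```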


(3) Derived functors with $R = \mathbb{Z}$. For the ring $\mathbb{Z}$ and the category $\mathcal{C}_{\mathbb{Z}}$ (or more
generally for $\mathcal{C}_R$ for any principal ideal domain $R$), projective resolutions and
injective resolutions can be fairly short, and derived functors in degree $\ge 2$ are
all 0. Let $M$ be a given unital $\mathbb{Z}$ module, i.e., an abelian group. We know that
$M$ is the quotient of some free abelian group $X_0$, say with a quotient map $\varepsilon$, and
then $X_1 = \ker\varepsilon$ is a subgroup of a free abelian group and hence is free abelian.
Thus a projective resolution of $M$ is
\[
0 \longrightarrow X_1 \xrightarrow{\ \mathrm{inc}\ } X_0 \xrightarrow{\ \varepsilon\ } M \longrightarrow 0 .
\]
The kinds of derived functors that make use of projective resolutions are the
covariant right exact ones and the contravariant left exact ones. If $F$ is such a
functor, then we are led to the complexes
\[
0 \longrightarrow F(X_1) \xrightarrow{\ F(\mathrm{inc})\ } F(X_0) \longrightarrow 0
\qquad \text{and} \qquad
0 \longleftarrow F(X_1) \xleftarrow{\ F(\mathrm{inc})\ } F(X_0) \longleftarrow 0
\]
in the two cases. Thus the values of the derived functors are $F_0(M) \cong F(M)$ and
$F_1(M) = \ker F(\mathrm{inc})$ in the first case, and $F^0(M) \cong F(M)$ and $F^1(M) = \operatorname{coker} F(\mathrm{inc})$
in the second case. Higher derived functors are 0. Similar remarks apply to
injective resolutions and the remaining two cases for derived functors in Figure
4.5. Every abelian group is a subgroup of a divisible group, which is injective in
$\mathcal{C}_{\mathbb{Z}}$, and the quotient of the divisible group by the given abelian group is divisible,
hence injective. Thus we can arrange for all terms of an injective resolution to
be 0 beyond the $X_1$ term, and an analysis of the results similar to the one above
is possible.
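The two-term recipe in Example 3 can be carried through explicitly for a covariant right exact functor. The following sketch takes $F = (\mathbb{Z}/n\mathbb{Z}) \otimes_{\mathbb{Z}} (\,\cdot\,)$ and $M = \mathbb{Z}/m\mathbb{Z}$ with $m, n \ge 1$; the functor, the module, and the abbreviation $d = \gcd(m, n)$ are choices made only for this illustration.

```latex
% Sketch: derived functors of F = (\mathbb{Z}/n\mathbb{Z}) \otimes_{\mathbb{Z}} ( . ) on M = \mathbb{Z}/m\mathbb{Z}.
% A projective resolution of M with X_1 = X_0 = \mathbb{Z}:
\[
0 \longrightarrow \mathbb{Z} \xrightarrow{\ m\cdot\ } \mathbb{Z}
  \xrightarrow{\ \varepsilon\ } \mathbb{Z}/m\mathbb{Z} \longrightarrow 0 .
\]
% Apply F, using (\mathbb{Z}/n\mathbb{Z}) \otimes_{\mathbb{Z}} \mathbb{Z} \cong \mathbb{Z}/n\mathbb{Z}, and drop F(M):
\[
0 \longrightarrow \mathbb{Z}/n\mathbb{Z} \xrightarrow{\ m\cdot\ } \mathbb{Z}/n\mathbb{Z} \longrightarrow 0 .
\]
% With d = \gcd(m, n), homology in degrees 0 and 1 is
\begin{align*}
F_0(M) &= (\mathbb{Z}/n\mathbb{Z}) / m(\mathbb{Z}/n\mathbb{Z}) \cong \mathbb{Z}/d\mathbb{Z}
          \cong (\mathbb{Z}/n\mathbb{Z}) \otimes_{\mathbb{Z}} (\mathbb{Z}/m\mathbb{Z}) , \\
F_1(M) &= \{ x \in \mathbb{Z}/n\mathbb{Z} \mid mx = 0 \}
          = (n/d)\mathbb{Z} / n\mathbb{Z} \cong \mathbb{Z}/d\mathbb{Z} ,
\end{align*}
% and F_k(M) = 0 for k \ge 2.
```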


6. Long Exact Sequences of Derived Functors 


The first four theorems of this section say that a short exact sequence of modules 
leads to a long exact sequence of derived functor modules and that it does so in 
a functorial way. Let us suppose that $F : \mathcal{C} \to \mathcal{C}'$ is an additive functor between
good categories. For the first of the theorems, suppose further that C has enough 
projectives and that F is one of the types of functors in Figure 4.5 making use of 
projective resolutions in the definition of its derived functors. The last of these 
conditions means that F is to be covariant right exact or contravariant left exact. 

To prove such a theorem, we shall want to apply Theorem 4.7, which produces 
a long exact sequence from a short exact sequence of complexes. To each of the 
modules in the given short exact sequence, we attach a projective resolution. If 
these projective resolutions can somehow be related by chain maps so as to give 
a short exact sequence of projectives in each degree, then we can apply F to the 
entire diagram, invoke Theorem 4.7, and obtain the desired long exact sequence. 
Application of Theorem 4.10, in combination with some further checking, will 
show that the passage from the given short exact sequence of modules to the long 
exact sequence of derived functor modules is functorial in the modules of the 
short exact sequence. 

Thus the problem is to obtain the compatible projective resolutions. Propo- 
sition 4.19a gives us a clue about what to look for: any short exact sequence of 
projectives has to be split. Here is the statement of the first theorem. 




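Theorem 4.22. Let $F : \mathcal{C} \to \mathcal{C}'$ be an additive functor between two good
categories. Suppose that $F$ either is covariant right exact or is contravariant
left exact, and suppose that $\mathcal{C}$ has enough projectives. Whenever there are three
modules and two maps in $\mathcal{C}$ forming a short exact sequence

\[
0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0,
\]

then the derived functors of $F$ on the three modules form a long exact sequence
in $\mathcal{C}'$ as follows:

(a) If $F$ is covariant and right exact, then the long exact sequence is
\[
\begin{aligned}
0 &\longleftarrow F(C) \longleftarrow F(B) \longleftarrow F(A) \longleftarrow F_1(C) \longleftarrow F_1(B) \longleftarrow F_1(A) \\
  &\longleftarrow F_2(C) \longleftarrow F_2(B) \longleftarrow F_2(A) \longleftarrow F_3(C) \longleftarrow \cdots.
\end{aligned}
\]

(b) If $F$ is contravariant and left exact, then the long exact sequence is
\[
\begin{aligned}
0 &\longrightarrow F(C) \longrightarrow F(B) \longrightarrow F(A) \longrightarrow F^1(C) \longrightarrow F^1(B) \longrightarrow F^1(A) \\
  &\longrightarrow F^2(C) \longrightarrow F^2(B) \longrightarrow F^2(A) \longrightarrow F^3(C) \longrightarrow \cdots.
\end{aligned}
\]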
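As a small illustration of conclusion (a), the following sketch works out one concrete case. It assumes the standard facts that $\mathbb{Z}/m \otimes_{\mathbb{Z}} (\,\cdot\,)$ is a covariant right exact functor on abelian groups and that $\mathbb{Z}$ is projective; the integers $m, n \geq 1$ and the particular short exact sequence are choices made here for illustration and do not come from the text.

```latex
% Illustrative sketch (assumed example):
% F = Z/m \otimes_Z ( . ),  applied to  0 -> Z --n--> Z --> Z/n -> 0,
% so that A = B = Z and C = Z/n in the notation of Theorem 4.22.
\[
F(\mathbb{Z}) \cong \mathbb{Z}/m, \qquad
F(\varphi) = \text{multiplication by } n \text{ on } \mathbb{Z}/m, \qquad
F_k(\mathbb{Z}) = 0 \ \text{ for } k \geq 1 ,
\]
% the last because 0 <- Z <- Z <- 0 is a projective resolution of Z.
% The long exact sequence in (a) therefore collapses to
\[
0 \longleftarrow F(\mathbb{Z}/n) \longleftarrow \mathbb{Z}/m
  \xleftarrow{\ n\ } \mathbb{Z}/m \longleftarrow F_1(\mathbb{Z}/n) \longleftarrow 0 ,
\]
% and, with d = gcd(m, n),
\[
F(\mathbb{Z}/n) \cong (\mathbb{Z}/m)\big/ n(\mathbb{Z}/m) \cong \mathbb{Z}/d, \qquad
F_1(\mathbb{Z}/n) \cong \ker\bigl(n : \mathbb{Z}/m \to \mathbb{Z}/m\bigr) \cong \mathbb{Z}/d, \qquad
F_k(\mathbb{Z}/n) = 0 \ \text{ for } k \geq 2 .
\]
```

The vanishing for $k \geq 2$ comes from the fact that $F_k(\mathbb{Z}/n)$ sits between $F_k(B) = 0$ and $F_{k-1}(A) = 0$ in the long exact sequence.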


We begin with a lemma. 


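Lemma 4.23. In the good category $\mathcal{C}$, suppose that the diagram

\[
\begin{array}{ccccccccc}
 & & 0 & & 0 & & 0 & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \longleftarrow & A & \xleftarrow{\ \varepsilon_A\ } & P_A & \xleftarrow{\ \psi_A\ } & M_A & \longleftarrow & 0 \\
 & & {\scriptstyle\varphi}\downarrow & & {\scriptstyle i_A}\downarrow & & {\scriptstyle\varphi_1}\downarrow & & \\
0 & \longleftarrow & B & \xleftarrow{\ \varepsilon_B\ } & P_A \oplus P_C & \xleftarrow{\ \psi_B\ } & M_B & \longleftarrow & 0 \\
 & & {\scriptstyle\psi}\downarrow & & {\scriptstyle p_C}\downarrow & & {\scriptstyle\psi_1}\downarrow & & \\
0 & \longleftarrow & C & \xleftarrow{\ \varepsilon_C\ } & P_C & \xleftarrow{\ \psi_C\ } & M_C & \longleftarrow & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & 0 & & 0 & & 0 & &
\end{array}
\]

(in which the arrows of the middle row and of the third column are the dashed ones)
has the first two columns and the two rows with solid arrows exact and has $P_A$
and $P_C$ projective. Here $i_A$ is the inclusion into the first component of $P_A \oplus P_C$,
and $p_C$ is the projection onto the second component. Then there exist a module
$M_B$ and maps $\varepsilon_B$, $\psi_B$, $\varphi_1$, and $\psi_1$ such that the whole diagram, including the
dashed arrows, has exact rows and columns and has all squares commuting.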




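PROOF. The module $P_A \oplus P_C$ is in $\mathcal{C}$ because $\mathcal{C}$ is good, and it is easy to see
that $P_A \oplus P_C$ is projective. Let us define $\varepsilon_B$. Since $P_C$ is projective, there exists
$h : P_C \to B$ such that $\psi h = \varepsilon_C$, and we put $\varepsilon_B(x_A, x_C) = \varphi\varepsilon_A x_A + h x_C$. Then
the equation

\[
\varphi\varepsilon_A x_A = \varepsilon_B(x_A, 0) = \varepsilon_B i_A x_A
\]

says that the upper left square commutes, and the equation

\[
\psi\varepsilon_B(x_A, x_C) = \psi\varphi\varepsilon_A x_A + \psi h x_C = 0 + \varepsilon_C x_C = \varepsilon_C p_C(x_A, x_C)
\]

says that the lower left square commutes.

To see that $\varepsilon_B$ is onto $B$, let $b \in B$ be given. Since $p_C$ and $\varepsilon_C$ are onto,
so is $\varepsilon_C p_C = \psi\varepsilon_B$. Thus we can choose $(x_A, x_C)$ in $P_A \oplus P_C$ with $\psi(b) =
\psi\varepsilon_B(x_A, x_C)$. Hence $b - \varepsilon_B(x_A, x_C)$ lies in $\ker\psi = \operatorname{image}\varphi$, and we can write

\[
b - \varepsilon_B(x_A, x_C) = \varphi(a) = \varphi\varepsilon_A(x'_A) = \varepsilon_B i_A(x'_A) = \varepsilon_B(x'_A, 0)
\]

for some $x'_A \in P_A$. Then $b = \varepsilon_B(x_A + x'_A, x_C)$, and $\varepsilon_B$ is onto.

Let $M_B = \ker\varepsilon_B$, and let $\psi_B : M_B \to P_A \oplus P_C$ be the inclusion. For $m_A$ in
$M_A$, let $\varphi_1(m_A) = (\psi_A m_A, 0)$. Then $\varphi_1(m_A)$ is in $M_B$ because

\[
\varepsilon_B(\psi_A m_A, 0) = \varphi\varepsilon_A\psi_A m_A + h0 = \varphi 0 + h0 = 0.
\]

Moreover, this definition of $\varphi_1$ makes the upper right square commute.

To define $\psi_1$, let $(x_A, x_C)$ be in $M_B$, so that $\varepsilon_B(x_A, x_C) = 0$. Then $0 =
\psi\varepsilon_B(x_A, x_C) = \varepsilon_C p_C(x_A, x_C) = \varepsilon_C(x_C)$, $x_C$ lies in $\ker\varepsilon_C = \operatorname{image}\psi_C$, and
$x_C = \psi_C(m_C)$ for a unique $m_C$ in $M_C$. We put $\psi_1(x_A, x_C) = m_C$. Then the
equation

\[
\psi_C\psi_1(x_A, x_C) = \psi_C(m_C) = x_C = p_C(x_A, x_C) = p_C\psi_B(x_A, x_C)
\]

shows that the lower right square commutes.

Now all the squares commute, and all the rows and columns are exact except
possibly the third column. Corollary 4.8 allows us to conclude that the third
column is exact, and the proof of the lemma is complete.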
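To see the construction of the lemma in action, here is a sketch of one small case over $\mathbb{Z}$. The modules, the maps, and the particular lift $h$ are all choices made here for illustration; they are not taken from the text.

```latex
% Illustrative sketch (assumed example): the short exact sequence
%   0 -> Z --2--> Z --> Z/2 -> 0,   i.e.  A = B = Z,  C = Z/2,
% with  P_A = Z, \varepsilon_A = identity, so M_A = ker \varepsilon_A = 0,
% and   P_C = Z, \varepsilon_C = reduction mod 2, so M_C = 2Z.
\[
h = \mathrm{id}_{\mathbb{Z}} : P_C \to B
\quad\text{satisfies}\quad \psi h = \varepsilon_C ,
\qquad
\varepsilon_B(x_A, x_C) = \varphi\varepsilon_A x_A + h x_C = 2x_A + x_C .
\]
% The map \varepsilon_B is onto Z, and its kernel is
\[
M_B = \ker\varepsilon_B = \{(x_A, -2x_A) : x_A \in \mathbb{Z}\} \cong \mathbb{Z},
\qquad
\varphi_1 = 0,
\qquad
\psi_1(x_A, -2x_A) = -2x_A \in 2\mathbb{Z} = M_C .
\]
% The third column  0 -> M_A = 0 -> M_B -> M_C -> 0  is exact because
% \psi_1 is an isomorphism of M_B onto 2Z.
```

A different lift, for example $h(x_C) = 3x_C$, also satisfies $\psi h = \varepsilon_C$ and gives a different $\varepsilon_B$ and $M_B$; the lemma asserts only existence, not uniqueness.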


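PROOF OF THEOREM 4.22. The main step is to construct projective resolutions
of $A$, $B$, and $C$ by an inductive process in such a way that the three resolutions
together form an exact sequence of chain complexes. We start by forming projective
resolutions

\[
0 \longleftarrow A \xleftarrow{\ \varepsilon_A\ } X_0 \xleftarrow{\ \alpha_0\ } X_1 \xleftarrow{\ \alpha_1\ } \cdots
\qquad\text{and}\qquad
0 \longleftarrow C \xleftarrow{\ \varepsilon_C\ } Z_0 \xleftarrow{\ \gamma_0\ } Z_1 \xleftarrow{\ \gamma_1\ } \cdots.
\]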




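Replacing $X_1$ by $M_A = \ker\varepsilon_A$ and $Z_1$ by $M_C = \ker\varepsilon_C$, we are led to the
starting diagram in Lemma 4.23. Application of the lemma produces a short
exact sequence

\[
0 \longleftarrow B \xleftarrow{\ \varepsilon_B\ } X_0 \oplus Z_0 \xleftarrow{\ \mathrm{inc}\ } \ker\varepsilon_B \longleftarrow 0
\]

and the vertical maps $\varphi_1$ and $\psi_1$ that make the squares commute in the lemma.
Next we move everything one step to the right, applying the lemma to a diagram
as in the lemma with first and third rows

\[
0 \longleftarrow \ker\varepsilon_A \xleftarrow{\ \alpha_0\ } X_1 \xleftarrow{\ \mathrm{inc}\ } \ker\alpha_0 \longleftarrow 0
\qquad\text{and}\qquad
0 \longleftarrow \ker\varepsilon_C \xleftarrow{\ \gamma_0\ } Z_1 \xleftarrow{\ \mathrm{inc}\ } \ker\gamma_0 \longleftarrow 0
\]

and with an exact sequence in the first column involving the maps $\varphi_1$ and $\psi_1$.
Application of the lemma produces a short exact sequence

\[
0 \longleftarrow \ker\varepsilon_B \xleftarrow{\ \beta_0\ } X_1 \oplus Z_1 \xleftarrow{\ \mathrm{inc}\ } \ker\beta_0 \longleftarrow 0
\]

and the vertical maps $\varphi_2$ and $\psi_2$ that make the squares commute in the lemma.
We can put these steps together to form the following diagram with exact rows
and columns and with commuting squares:

\[
\begin{array}{ccccccccccc}
 & & 0 & & 0 & & 0 & & 0 & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \longleftarrow & A & \xleftarrow{\ \varepsilon_A\ } & X_0 & \xleftarrow{\ \alpha_0\ } & X_1 & \xleftarrow{\ \mathrm{inc}\ } & \ker\alpha_0 & \longleftarrow & 0 \\
 & & {\scriptstyle\varphi}\downarrow & & {\scriptstyle i_{X_0}}\downarrow & & {\scriptstyle i_{X_1}}\downarrow & & {\scriptstyle\varphi_2}\downarrow & & \\
0 & \longleftarrow & B & \xleftarrow{\ \varepsilon_B\ } & X_0 \oplus Z_0 & \xleftarrow{\ \beta_0\ } & X_1 \oplus Z_1 & \xleftarrow{\ \mathrm{inc}\ } & \ker\beta_0 & \longleftarrow & 0 \\
 & & {\scriptstyle\psi}\downarrow & & {\scriptstyle p_{Z_0}}\downarrow & & {\scriptstyle p_{Z_1}}\downarrow & & {\scriptstyle\psi_2}\downarrow & & \\
0 & \longleftarrow & C & \xleftarrow{\ \varepsilon_C\ } & Z_0 & \xleftarrow{\ \gamma_0\ } & Z_1 & \xleftarrow{\ \mathrm{inc}\ } & \ker\gamma_0 & \longleftarrow & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \downarrow & & \\
 & & 0 & & 0 & & 0 & & 0 & &
\end{array}
\]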

We can repeat the use of Lemma 4.23, starting from the last column of the above 
diagram and more of the projective resolutions of A and C, and then we can merge 
the new result with the diagram above to obtain a diagram with one additional 
column. Continuing in this way, we arrive at three projective resolutions and 
vertical maps that together form an exact sequence of chain complexes. 




To obtain a long exact sequence for our derived functors, we apply the functor
$F$ to the final diagram above, except that we drop the left column of 0's and the
column containing $A$, $B$, $C$. After the application of $F$, the remaining columns
are still exact because the columns in $\mathcal{C}$ are split and because $F$ sends split
exact sequences to split exact sequences.¹⁰ Then we apply Theorem 4.7, taking
Proposition 4.21 into account, and the long exact sequence results except for the
one detail of the 0 at the end. In other words, we still have to prove exactness
at $F(C)$. But exactness at this point is immediate from the assumed one-sided
exactness of $F$. This completes the proof.


Before addressing the functoriality of the association in Theorem 4.22, let us 
record the corresponding result when the derived functor makes use of injective 
resolutions. 


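Theorem 4.24. Let $F : \mathcal{C} \to \mathcal{C}'$ be an additive functor between two good
categories. Suppose that $F$ either is contravariant right exact or is covariant
left exact, and suppose that $\mathcal{C}$ has enough injectives. Whenever there are three
modules and two maps in $\mathcal{C}$ forming a short exact sequence

\[
0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0,
\]

then the derived functors of $F$ on the three modules form a long exact sequence
in $\mathcal{C}'$ as follows:

(a) If $F$ is contravariant and right exact, then the long exact sequence is
\[
\begin{aligned}
0 &\longleftarrow F(A) \longleftarrow F(B) \longleftarrow F(C) \longleftarrow F_1(A) \longleftarrow F_1(B) \longleftarrow F_1(C) \\
  &\longleftarrow F_2(A) \longleftarrow F_2(B) \longleftarrow F_2(C) \longleftarrow F_3(A) \longleftarrow \cdots.
\end{aligned}
\]

(b) If $F$ is covariant and left exact, then the long exact sequence is
\[
\begin{aligned}
0 &\longrightarrow F(A) \longrightarrow F(B) \longrightarrow F(C) \longrightarrow F^1(A) \longrightarrow F^1(B) \longrightarrow F^1(C) \\
  &\longrightarrow F^2(A) \longrightarrow F^2(B) \longrightarrow F^2(C) \longrightarrow F^3(A) \longrightarrow \cdots.
\end{aligned}
\]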


¹⁰ A split exact sequence is the union of two four-term exact sequences from each end, and $F$ is
exact on each of these. In addition, we saw in Section 2 that $F$ respects direct sums. It follows that
$F$ carries split exact sequences to split exact sequences.

PROOF. The necessary modifications to the proof of Theorem 4.22 are fairly
straightforward, but some comments are in order concerning how Lemma 4.23 is
to be modified. In the diagram in the statement of Lemma 4.23, all the horizontal
arrows are to be reversed, the projectives $P_A$ and $P_C$ are to be replaced by injectives
$I_A$ and $I_C$, and $M_A$ and $M_C$ are the quotients $M_A = I_A/\varepsilon_A(A)$ and $M_C =
I_C/\varepsilon_C(C)$. Let us define $\varepsilon_B$. Since $I_A$ is injective, choose $h : B \to I_A$ with
$h\varphi = \varepsilon_A$, and put $\varepsilon_B(b) = (h(b), \varepsilon_C\psi(b))$. Then the equation

\[
\varepsilon_B\varphi(a) = (h\varphi a, \varepsilon_C\psi\varphi a) = (\varepsilon_A(a), 0) = i_A\varepsilon_A(a)
\]

says that the upper left square commutes, and the equation

\[
\varepsilon_C\psi(b) = p_C(h(b), \varepsilon_C\psi(b)) = p_C\varepsilon_B(b)
\]

says that the lower left square commutes.

To see that $\varepsilon_B$ is one-one, let $\varepsilon_B(b) = 0$. Then $0 = p_C\varepsilon_B(b) = \varepsilon_C\psi(b)$.
Since $\varepsilon_C$ is one-one, $\psi(b) = 0$, $b$ lies in $\ker\psi = \operatorname{image}\varphi$, and $b = \varphi(a)$. Then
$0 = \varepsilon_B(b) = \varepsilon_B\varphi(a) = i_A\varepsilon_A(a)$, and $a = 0$ because $i_A$ and $\varepsilon_A$ are one-one.
Hence $b = \varphi(a) = 0$, and $\varepsilon_B$ is one-one.

Let $M_B = (I_A \oplus I_C)/\varepsilon_B(B)$, and let $\psi_B : I_A \oplus I_C \to M_B$ be the quotient
map. To define $\varphi_1$, we let $\varphi_1(m_A) = \psi_B(x_A, 0)$ if $m_A = \psi_A x_A$ with $x_A \in I_A$.
If $x'_A$ is another preimage of $m_A$ under $\psi_A$, then $x'_A - x_A = \varepsilon_A(a)$ for some
$a \in A$, and $\psi_B(x'_A, 0) - \psi_B(x_A, 0) = \psi_B i_A\varepsilon_A(a) = \psi_B\varepsilon_B\varphi(a) = 0$; hence
$\varphi_1$ is well defined. Since $\psi_B i_A x_A = \psi_B(x_A, 0) = \varphi_1 m_A = \varphi_1\psi_A x_A$, the
upper right square commutes. To define $\psi_1$, let $m_B \in M_B$ be $\psi_B(x_A, x_C)$, and
define $\psi_1(m_B) = \psi_C(x_C)$. If $(x'_A, x'_C)$ is another preimage of $m_B$ under $\psi_B$,
then $(x'_A, x'_C) - (x_A, x_C) = \varepsilon_B(b)$ for some $b \in B$, and $\psi_C(x'_C) - \psi_C(x_C) =
\psi_C p_C(x'_A, x'_C) - \psi_C p_C(x_A, x_C) = \psi_C p_C\varepsilon_B(b) = \psi_C\varepsilon_C\psi(b) = 0$; hence $\psi_1$ is
well defined. Since $\psi_C p_C(x_A, x_C) = \psi_C(x_C) = \psi_1(m_B) = \psi_1\psi_B(x_A, x_C)$, the
lower right square commutes.

Now all the squares commute, and all the rows and columns are exact except
possibly the third column. Corollary 4.8 allows us to conclude that the third
column is exact, and the proof of the analog of Lemma 4.23 for injectives is
complete. Theorem 4.24 then follows routinely.


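Theorem 4.25. Let $F : \mathcal{C} \to \mathcal{C}'$ be an additive functor between two good
categories. Suppose that $F$ either is covariant right exact or is contravariant left
exact, and suppose that $\mathcal{C}$ has enough projectives. Then the passage as in Theorem
4.22 from short exact sequences in $\mathcal{C}$ to long exact sequences of derived functor
modules in $\mathcal{C}'$ is functorial in the following sense: whenever

\[
\begin{array}{ccccccccc}
0 & \longrightarrow & \widetilde{A} & \xrightarrow{\ \widetilde{\varphi}\ } & \widetilde{B} & \xrightarrow{\ \widetilde{\psi}\ } & \widetilde{C} & \longrightarrow & 0 \\
 & & {\scriptstyle f_A}\downarrow & & {\scriptstyle f_B}\downarrow & & {\scriptstyle f_C}\downarrow & & \\
0 & \longrightarrow & A & \xrightarrow{\ \varphi\ } & B & \xrightarrow{\ \psi\ } & C & \longrightarrow & 0
\end{array}
\]

is a diagram in $\mathcal{C}$ with exact rows and commuting squares, then the long exact
sequences of derived functors of $F$ on $A$, $B$, $C$ and $\widetilde{A}$, $\widetilde{B}$, $\widetilde{C}$ make commutative
squares with the maps induced by the derived functors on $f_A$, $f_B$, $f_C$.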


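PROOF. The proof of Theorem 4.22 involved constructing a diagram

\[
\begin{array}{ccccccccccc}
 & & 0 & & 0 & & 0 & & 0 & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \longleftarrow & A & \xleftarrow{\ \varepsilon_A\ } & X_0 & \xleftarrow{\ \alpha_0\ } & X_1 & \xleftarrow{\ \alpha_1\ } & X_2 & \longleftarrow & \cdots \\
 & & {\scriptstyle\varphi}\downarrow & & {\scriptstyle i_{X_0}}\downarrow & & {\scriptstyle i_{X_1}}\downarrow & & {\scriptstyle i_{X_2}}\downarrow & & \\
0 & \longleftarrow & B & \xleftarrow{\ \varepsilon_B\ } & X_0 \oplus Z_0 & \xleftarrow{\ \beta_0\ } & X_1 \oplus Z_1 & \xleftarrow{\ \beta_1\ } & X_2 \oplus Z_2 & \longleftarrow & \cdots \\
 & & {\scriptstyle\psi}\downarrow & & {\scriptstyle p_{Z_0}}\downarrow & & {\scriptstyle p_{Z_1}}\downarrow & & {\scriptstyle p_{Z_2}}\downarrow & & \\
0 & \longleftarrow & C & \xleftarrow{\ \varepsilon_C\ } & Z_0 & \xleftarrow{\ \gamma_0\ } & Z_1 & \xleftarrow{\ \gamma_1\ } & Z_2 & \longleftarrow & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \downarrow & & \\
 & & 0 & & 0 & & 0 & & 0 & &
\end{array}
\]

with exact rows and commuting squares in which each $X_n$ and $Z_n$ is projective,
and a similar diagram corresponds to the given short exact sequence with tildes
on it. The present theorem will follow from the functoriality in Theorem 4.10
if we can arrange that these two diagrams can be embedded in a 3-dimensional
diagram with each of these diagrams in a horizontal plane and with vertical maps
from the one diagram to the other such that all vertical squares commute.

We are given vertical maps $f_A$, $f_B$, and $f_C$, which we can regard as extending
from the diagram with tildes to the other diagram. In addition, Theorem 4.12
gives us chain maps $\{f_{X_n}\}$ and $\{f_{Z_n}\}$ with $f_{X_{-1}} = f_A$ and $f_{Z_{-1}} = f_C$, and all the
completed vertical squares in the 3-dimensional diagram commute. To complete
the proof, we construct by induction for $n \geq 0$ a map $f_n : \widetilde{X}_n \oplus \widetilde{Z}_n \to X_n \oplus Z_n$
such that

\[
p_{Z_n} f_n = f_{Z_n} p_{\widetilde{Z}_n}, \qquad
f_n i_{\widetilde{X}_n} = i_{X_n} f_{X_n}, \qquad
\beta_{n-1} f_n = f_{n-1}\widetilde{\beta}_{n-1},
\tag{$*$}
\]

with the understanding that $\beta_{-1} = \varepsilon_B$. To make it possible for the inductive step
to include the starting step of the induction, let us write $X_{-1} = A$, $Z_{-1} = C$,
$i_{X_{-1}} = \varphi$, $p_{Z_{-1}} = \psi$, $\alpha_{-1} = \varepsilon_A$, $\gamma_{-1} = \varepsilon_C$, and $f_{-1} = f_B$, with the
corresponding conventions for the diagram with tildes. Also, let us
understand any module or map with subscript $-2$ to be 0.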


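We shall construct $f_n$. For $\tilde z \in \widetilde{Z}_n$, we apply $p_{Z_{n-1}}$ to the difference
$\beta_{n-1}(0, f_{Z_n}\tilde z) - f_{n-1}\widetilde{\beta}_{n-1}(0, \tilde z)$ and get

\[
\begin{aligned}
p_{Z_{n-1}}\beta_{n-1}(0, f_{Z_n}\tilde z) - p_{Z_{n-1}} f_{n-1}\widetilde{\beta}_{n-1}(0, \tilde z)
 &= \gamma_{n-1} p_{Z_n}(0, f_{Z_n}\tilde z) - f_{Z_{n-1}} p_{\widetilde{Z}_{n-1}}\widetilde{\beta}_{n-1}(0, \tilde z) \\
 &= \gamma_{n-1} f_{Z_n}\tilde z - f_{Z_{n-1}}\widetilde{\gamma}_{n-1} p_{\widetilde{Z}_n}(0, \tilde z) \\
 &= f_{Z_{n-1}}\widetilde{\gamma}_{n-1}\tilde z - f_{Z_{n-1}}\widetilde{\gamma}_{n-1}\tilde z = 0.
\end{aligned}
\]

Thus $\beta_{n-1}(0, f_{Z_n}\tilde z) - f_{n-1}\widetilde{\beta}_{n-1}(0, \tilde z) = i_{X_{n-1}}(x)$ for a unique $x \in X_{n-1}$, and we
define $\tau : \widetilde{Z}_n \to X_{n-1}$ by saying that $\tau(\tilde z)$ should be this $x$. This makes

\[
i_{X_{n-1}}\tau(\tilde z) = \beta_{n-1}(0, f_{Z_n}\tilde z) - f_{n-1}\widetilde{\beta}_{n-1}(0, \tilde z).
\]

Setting up the diagram

\[
\begin{array}{ccccc}
 & & \widetilde{Z}_n & & \\
 & & {\scriptstyle\tau}\downarrow & \searrow{\scriptstyle\sigma} & \\
X_{n-2} & \xleftarrow{\ \alpha_{n-2}\ } & X_{n-1} & \xleftarrow{\ \alpha_{n-1}\ } & X_n ,
\end{array}
\]

we prepare to invoke Lemma 4.11. We have

\[
\begin{aligned}
i_{X_{n-2}}\alpha_{n-2}\tau(\tilde z) = \beta_{n-2} i_{X_{n-1}}\tau(\tilde z)
 &= \beta_{n-2}\beta_{n-1}(0, f_{Z_n}\tilde z) - \beta_{n-2} f_{n-1}\widetilde{\beta}_{n-1}(0, \tilde z) \\
 &= 0 - f_{n-2}\widetilde{\beta}_{n-2}\widetilde{\beta}_{n-1}(0, \tilde z) = 0.
\end{aligned}
\]

Since $i_{X_{n-2}}$ is one-one, $\alpha_{n-2}\tau = 0$, and Lemma 4.11 applies. Thus we obtain
$\sigma : \widetilde{Z}_n \to X_n$ with $\alpha_{n-1}\sigma = \tau$, and $\sigma$ satisfies

\[
i_{X_{n-1}}\alpha_{n-1}\sigma(\tilde z) = \beta_{n-1}(0, f_{Z_n}\tilde z) - f_{n-1}\widetilde{\beta}_{n-1}(0, \tilde z).
\tag{$**$}
\]

Define

\[
f_n(\tilde x, \tilde z) = \bigl(f_{X_n}(\tilde x) - \sigma(\tilde z),\ f_{Z_n}(\tilde z)\bigr).
\tag{$\dagger$}
\]

With $f_n$ defined, we are to prove the three formulas $(*)$. For the first formula
in $(*)$, we apply $p_{Z_n}$ to both sides of $(\dagger)$ and obtain $p_{Z_n} f_n(\tilde x, \tilde z) = f_{Z_n}(\tilde z) =
f_{Z_n} p_{\widetilde{Z}_n}(\tilde x, \tilde z)$, which is the desired formula. The second formula in $(*)$ at $\tilde x$ is just
$(\dagger)$ with $\tilde z = 0$.

We are left with proving the third formula in $(*)$. Using the second formula in
$(*)$, we have

\[
\begin{aligned}
\beta_{n-1} f_n(\tilde x, 0) = \beta_{n-1} f_n i_{\widetilde{X}_n}(\tilde x)
 &= \beta_{n-1} i_{X_n} f_{X_n}(\tilde x) \\
 &= i_{X_{n-1}}\alpha_{n-1} f_{X_n}(\tilde x) = i_{X_{n-1}} f_{X_{n-1}}\widetilde{\alpha}_{n-1}(\tilde x) \\
 &= f_{n-1} i_{\widetilde{X}_{n-1}}\widetilde{\alpha}_{n-1}(\tilde x) = f_{n-1}\widetilde{\beta}_{n-1} i_{\widetilde{X}_n}(\tilde x) \\
 &= f_{n-1}\widetilde{\beta}_{n-1}(\tilde x, 0).
\end{aligned}
\tag{$\dagger\dagger$}
\]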




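Also,

\[
\begin{aligned}
\beta_{n-1} f_n(0, \tilde z)
 &= -\beta_{n-1} i_{X_n}\sigma(\tilde z) + \beta_{n-1}(0, f_{Z_n}(\tilde z)) && \text{by } (\dagger) \\
 &= -i_{X_{n-1}}\alpha_{n-1}\sigma(\tilde z) + \beta_{n-1}(0, f_{Z_n}(\tilde z)) && \text{by commutativity} \\
 &= f_{n-1}\widetilde{\beta}_{n-1}(0, \tilde z) && \text{by } (**).
\end{aligned}
\]

Adding this equality and $(\dagger\dagger)$, we obtain the third formula of $(*)$. This completes
the proof.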


The version of Theorem 4.25 appropriate for Theorem 4.24 is the following, 
and its proof is similar. 


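Theorem 4.26. Let $F : \mathcal{C} \to \mathcal{C}'$ be an additive functor between two good
categories. Suppose that $F$ either is contravariant right exact or is covariant left
exact, and suppose that $\mathcal{C}$ has enough injectives. Then the passage as in Theorem
4.24 from short exact sequences in $\mathcal{C}$ to long exact sequences of derived functor
modules in $\mathcal{C}'$ is functorial in the following sense: whenever

\[
\begin{array}{ccccccccc}
0 & \longrightarrow & \widetilde{A} & \xrightarrow{\ \widetilde{\varphi}\ } & \widetilde{B} & \xrightarrow{\ \widetilde{\psi}\ } & \widetilde{C} & \longrightarrow & 0 \\
 & & {\scriptstyle f_A}\downarrow & & {\scriptstyle f_B}\downarrow & & {\scriptstyle f_C}\downarrow & & \\
0 & \longrightarrow & A & \xrightarrow{\ \varphi\ } & B & \xrightarrow{\ \psi\ } & C & \longrightarrow & 0
\end{array}
\]

is a diagram in $\mathcal{C}$ with exact rows and commuting squares, then the long exact
sequences of derived functors of $F$ on $A$, $B$, $C$ and $\widetilde{A}$, $\widetilde{B}$, $\widetilde{C}$ make commutative
squares with the maps induced by the derived functors on $f_A$, $f_B$, $f_C$.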


We come to an important application of the long exact sequences in Theorems
4.22 and 4.24. Projective and injective resolutions make it easy to work with derived
functors theoretically, but in practice any computations with them are likely
to be difficult. It is therefore convenient to be able to compute derived functors
from other resolutions than projective and injective ones.¹¹ For definiteness let
us work with the case of a covariant left exact functor in a good category with
enough injectives; this is the most important case in applications, and the other
three cases in Figure 4.5 can be handled in similar fashion. Let $F : \mathcal{C} \to \mathcal{C}'$ be an
additive functor between good categories that is covariant left exact. A module
$M$ in $\mathcal{C}$ is said to be $F$-acyclic if $F^n(M) = 0$ for all $n \geq 1$. Every module $M$
that is injective in $\mathcal{C}$ is $F$-acyclic, since $0 \to M \to M \to 0$ is an injective
resolution of $M$ from which we can see that $F^n(M) = 0$ for $n \geq 1$. An $F$-acyclic
resolution of a module $A$ in $\mathcal{C}$ is a resolution $X = (A \to X^+)$ in which $X_n$ is
an $F$-acyclic module for all $n \geq 0$.

¹¹ The case of sheaf cohomology illustrates this point well. The present theory extends from
good categories of modules to arbitrary abelian categories along the lines of Section 8 below, and
the cohomology theory of sheaves fits into this more general framework. One additive functor
of interest with sheaves is the "global-sections" functor. Its derived functors can be formed with
injective resolutions, built from "flabby" sheaves, but flabby sheaves as a practical matter are too
big to be useful in computations. In the theory of several complex variables for example, one
approach is to substitute "fine" sheaves in resolutions; these permit computations and fall under the
abelian-category generalization of Theorem 4.27 below.


Theorem 4.27. Let $\mathcal{C}$ and $\mathcal{C}'$ be two good categories, let $F$ be an additive
functor from $\mathcal{C}$ to $\mathcal{C}'$ that is covariant and left exact, and suppose that $\mathcal{C}$ has enough
injectives. If a module $A$ in $\mathcal{C}$ has an $F$-acyclic resolution $X = (A \to X^+)$
and if $I = (A \to I^+)$ is any injective resolution of $A$, then any cochain map
$f : X \to I$ with $f_{-1} = 1_A$ induces an isomorphism $F^n(A) \cong H^n(F(X))$ for
each $n \geq 0$.


REMARKS. Such a cochain map always exists and is unique up to homotopy,
according to Theorem 4.16. Theorem 4.27 says that the derived functors of
$F$ on any module $A$ can be computed from any $F$-acyclic resolution of $A$; it
is not necessary to work only with injective resolutions. The same result as
in the theorem holds with $F_n(A) \cong H_n(F(X))$ if $F$ is contravariant and right
exact. If $F$ is covariant right exact or contravariant left exact and if $\mathcal{C}$ has
enough projectives, then any chain map from a projective resolution of $A$ to
an $F$-acyclic resolution¹² induces an isomorphism of the derived functors of $A$
with the homology or cohomology of $F$ of the $F$-acyclic resolution.
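The following sketch shows Theorem 4.27 at work on a resolution that is acyclic but not injective. It is an illustration only: the functor $F = \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/m,\,\cdot\,)$ with $m \geq 2$, the module $A = \mathbb{Z}$, and the resolution are choices made here, and the sketch assumes the standard fact that an abelian group is injective exactly when it is divisible.

```latex
% Illustrative sketch (assumed example): F(M) = Hom_Z(Z/m, M) \cong M[m],
% the subgroup of elements killed by m.
%
% Step 1: every m-divisible group M (mM = M) is F-acyclic.  Embed M in an
% injective I_0; then I_1 = I_0/M is divisible, so 0 -> M -> I_0 -> I_1 -> 0
% is an injective resolution, F^n(M) = 0 for n >= 2, and
\[
F^1(M) \cong \operatorname{coker}\bigl(I_0[m] \to I_1[m]\bigr) = 0 ,
\]
% since if mx lies in M = mM, say mx = my with y in M, then x - y lies in
% I_0[m] and maps to the coset x + M.
%
% Step 2: an F-acyclic resolution of A = Z, with X_0 not injective:
\[
0 \longrightarrow \mathbb{Z} \longrightarrow X_0 = \mathbb{Z}[1/m]
  \longrightarrow X_1 = \mathbb{Z}[1/m]/\mathbb{Z} \longrightarrow 0 .
\]
%
% Step 3: apply F and take cohomology.
\[
F(X_0) = 0, \qquad
F(X_1) = \bigl(\tfrac{1}{m}\mathbb{Z}\bigr)/\mathbb{Z} \cong \mathbb{Z}/m ,
\qquad\text{so}\qquad
H^0(F(X)) = 0, \quad H^1(F(X)) \cong \mathbb{Z}/m .
\]
```

By the theorem these groups are $F^0(\mathbb{Z})$ and $F^1(\mathbb{Z})$. The group $\mathbb{Z}[1/m]$ is not divisible by primes that do not divide $m$, so this resolution is not an injective one.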


¹² For this situation, $F$-acyclic resolutions are understood to be chain complexes rather than
cochain complexes.

PROOF. The injective resolution is at our disposal, according to Corollary
4.17. Using the hypothesis that $\mathcal{C}$ has enough injectives, choose for each $n$ an
injective $J_n$ containing $X_n$, let $g_n : X_n \to J_n$ be the inclusion, and make $\{J_n\}$
into an injective resolution of 0 with coboundary maps 0. Then replace $I$ in the
assumptions by $I \oplus J$ and $f$ by $(f, g)$. The result is that we have reduced the
theorem to the case that $f$ is one-one. Changing notation, we may assume from
the outset that the injective resolution is $I = (A \to I^+)$ and that the chain map
$f : X \to I$ is one-one in each degree.

Put $Y_n = I_n/f_n(X_n) = \operatorname{coker} f_n$. The sequence

\[
0 \longrightarrow X_n \xrightarrow{\ f_n\ } I_n \longrightarrow Y_n \longrightarrow 0
\tag{$*$}
\]

is exact, and Theorem 4.24b shows that the sequence

\[
F^k(I_n) \longrightarrow F^k(Y_n) \longrightarrow F^{k+1}(X_n)
\]

is exact for every $k \geq 0$. Since $I_n$ and $X_n$ are $F$-acyclic for $n \geq 0$, the end terms
are 0 for all $k \geq 1$. Consequently $Y_n$ is $F$-acyclic for all $n \geq 0$.

Referring to $(*)$ for $n$ and for $n + 1$, we see that the coboundary map from $I_n$
to $I_{n+1}$ induces a compatible coboundary map from $Y_n$ to $Y_{n+1}$. Thus we may
consider $Y = (0 \to Y^+)$ as a cochain complex with $Y^+ = \{Y_n\}_{n \geq 0}$. Then the
equations $(*)$ for all $n \geq 0$, together with the coboundary maps, make

\[
0 \longrightarrow X \xrightarrow{\ f\ } I \longrightarrow Y \longrightarrow 0
\tag{$**$}
\]

into a short exact sequence of complexes. Since $X$ and $I$ are exact, Corollary 4.8
shows that $Y$ is exact.

If we apply $F$ to the short exact sequence of complexes $(**)$, we obtain a
planar diagram

\[
0 \longrightarrow F(X) \xrightarrow{\ F(f)\ } F(I) \longrightarrow F(Y) \longrightarrow 0
\tag{$\dagger$}
\]

whose rows are the result of applying $F$ to $(*)$, whose columns are complexes,
and whose squares commute. As usual we drop the row for $n = -1$, replacing
it with a row of 0's. Let us prove that $(\dagger)$ is in fact a short exact sequence of
complexes. In fact, the result of applying $F$ to $(*)$ is the long exact sequence that
begins

\[
0 \longrightarrow F(X_n) \longrightarrow F(I_n) \longrightarrow F(Y_n) \longrightarrow F^1(X_n).
\]

For $n \geq 0$, $X_n$ is $F$-acyclic. Thus $F^1(X_n) = 0$, and the exactness for $n \geq 0$
follows. For $n \leq -1$, the rows of the diagram $(\dagger)$ are 0 and hence are exact. Thus
$(\dagger)$ is a short exact sequence of complexes.

We shall now prove that $F(Y) = (0 \to F(Y^+))$ is exact. Combining this
fact with the exactness of the rows of $(\dagger)$ and applying Corollary 4.8 will then
yield $H^n(F(X)) \cong H^n(F(I))$ for all $n \geq 0$. Since $H^n(F(I)) = F^n(A)$, this
step will complete the proof.

To prove that $F(Y) = (0 \to F(Y^+))$ is exact, define $Z_0 = Y_0$ and $Z_n =
\operatorname{coker}(Y_{n-1} \to Y_n)$ for $n \geq 1$. Let $d_n : Y_n \to Y_{n+1}$ be the coboundary map. For
each $n \geq 0$, the complex

\[
0 \longrightarrow Y_n/\ker d_n \longrightarrow Y_{n+1} \longrightarrow Z_{n+1} \longrightarrow 0
\]

is exact. Since $\ker d_n = \operatorname{image} d_{n-1}$ by exactness of $Y$, we have $Y_n/\ker d_n =
Y_n/\operatorname{image} d_{n-1} = Z_n$, and thus

\[
0 \longrightarrow Z_n \longrightarrow Y_{n+1} \longrightarrow Z_{n+1} \longrightarrow 0
\tag{$\dagger\dagger$}
\]

is exact for all $n \geq 0$.
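Let us use $(\dagger\dagger)$ to prove the preliminary result that $Z_n$ is $F$-acyclic for all
$n \geq 0$. For $n = 0$, $Z_0 = Y_0$, and $Y_0$ is known to be $F$-acyclic. Proceeding
inductively, suppose that $Z_n$ is known to be $F$-acyclic. Applying Theorem 4.24b
to $(\dagger\dagger)$, we see that

\[
F^k(Y_{n+1}) \longrightarrow F^k(Z_{n+1}) \longrightarrow F^{k+1}(Z_n)
\]

is exact for all $n \geq 0$ and all $k \geq 0$. For $n \geq 0$ and $k \geq 1$, the left end is 0 because
$Y_{n+1}$ is $F$-acyclic, and the right end is 0 because $Z_n$ is $F$-acyclic by the inductive
hypothesis. Therefore the middle term is 0, $Z_{n+1}$ is $F$-acyclic, and the induction
is complete.

Theorem 4.24b when applied to $(\dagger\dagger)$ shows that

\[
0 \longrightarrow F(Z_n) \longrightarrow F(Y_{n+1}) \longrightarrow F(Z_{n+1}) \longrightarrow F^1(Z_n)
\]

is exact for all $n \geq 0$, and we now know that the term at the right end is 0.
Therefore

\[
0 \longrightarrow F(Z_n) \longrightarrow F(Y_{n+1}) \longrightarrow F(Z_{n+1}) \longrightarrow 0
\tag{$\ddagger$}
\]

is exact for all $n \geq 0$.

Now we can prove that the complex

\[
0 \longrightarrow F(Y_0) \longrightarrow F(Y_1) \longrightarrow F(Y_2) \longrightarrow \cdots \longrightarrow F(Y_n) \longrightarrow \cdots
\tag{$\ddagger\ddagger$}
\]

is exact at each module $F(Y_n)$. We know from Section 2 that we can merge two
exact sequences

\[
\cdots \longrightarrow F(Y_{n+1}) \longrightarrow F(Z_{n+1}) \longrightarrow 0
\qquad\text{and}\qquad
0 \longrightarrow F(Z_{n+1}) \longrightarrow F(Y_{n+2}) \longrightarrow \cdots
\]

into a single exact sequence

\[
\cdots \longrightarrow F(Y_{n+1}) \longrightarrow F(Y_{n+2}) \longrightarrow \cdots.
\]

Consequently inductive application of $(\ddagger)$ shows that the sequence

\[
0 \longrightarrow F(Z_0) \longrightarrow F(Y_1) \longrightarrow F(Y_2) \longrightarrow \cdots \longrightarrow F(Y_{n+1}) \longrightarrow F(Z_{n+1}) \longrightarrow 0
\]

is exact for each $n \geq 0$. In addition, we know that $Z_0 = Y_0$ by definition.
Therefore $(\ddagger\ddagger)$ is exact at $F(Y_n)$ for each $n \geq 0$, and the proof is complete.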


Theorems 4.22 and 4.24 produce a long exact sequence from one additive 
functor and a short exact sequence of modules. Although it may at first seem odd 
to do so, we can obtain a different long exact sequence by varying the functor 
and fixing the module. This result, given as Proposition 4.28 below, will be used 
in the next section in analyzing the Ext and Tor functors. 




Let $\mathcal{C}$ and $\mathcal{C}'$ be two good categories, and let $F$, $G$, $H$ be three additive functors
from $\mathcal{C}$ to $\mathcal{C}'$. For definiteness, suppose that $F$, $G$, $H$ are covariant and right exact.
Suppose that there is a natural transformation $S$ of $F$ into $G$ and there is a natural
transformation $T$ of $G$ into $H$. We say that the sequence

\[
F \xrightarrow{\ S\ } G \xrightarrow{\ T\ } H
\]

is exact on projectives if for every projective $P$ in $\mathcal{C}$, the sequence

\[
0 \longrightarrow F(P) \xrightarrow{\ S_P\ } G(P) \xrightarrow{\ T_P\ } H(P) \longrightarrow 0
\]

is exact. Analogous definitions are to be made with projectives or injectives for
the three other kinds of derived functors as in Figure 4.5.


Proposition 4.28. Let $\mathcal{C}$ and $\mathcal{C}'$ be two good categories, let $F$, $G$, $H$ be three
additive functors from $\mathcal{C}$ to $\mathcal{C}'$, suppose that $F$, $G$, $H$ are covariant and right exact,
and suppose that $\mathcal{C}$ has enough projectives. If there are natural transformations
$S : F \to G$ and $T : G \to H$ such that the sequence $F \xrightarrow{S} G \xrightarrow{T} H$ is exact
on projectives, then the derived functors of $F$, $G$, $H$ on each module $A$ in $\mathcal{C}$ form
a long exact sequence

\[
\begin{aligned}
0 &\longleftarrow H(A) \longleftarrow G(A) \longleftarrow F(A) \longleftarrow H_1(A) \longleftarrow G_1(A) \longleftarrow F_1(A) \\
  &\longleftarrow H_2(A) \longleftarrow G_2(A) \longleftarrow F_2(A) \longleftarrow H_3(A) \longleftarrow \cdots.
\end{aligned}
\]

The passage from $A$ to the long exact sequence is functorial in $A$.


REMARKS. The same long exact sequence and functoriality hold with the 
arrows reversed and F and H interchanged if the three functors are contravariant 
and left exact. If F, G, H are contravariant and right exact or are covariant and 
left exact, then analogous conclusions are valid provided C has enough injectives 
and the natural transformations S and T are exact on injectives. 
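A concrete instance may help in reading Proposition 4.28. The sketch below takes $\mathcal{C} = \mathcal{C}'$ to be the category of all abelian groups and fixes an integer $n \geq 1$; the three functors and the two natural transformations are choices made here for illustration, and the sketch assumes that projective abelian groups are torsion-free (being direct summands of free abelian groups).

```latex
% Illustrative sketch (assumed example):
%   F(A) = A,   G(A) = A,   H(A) = A/nA,
%   S_A = multiplication by n,   T_A = quotient map A -> A/nA.
% Exactness on projectives: for P projective, hence torsion-free,
\[
0 \longrightarrow P \xrightarrow{\ n\ } P \longrightarrow P/nP \longrightarrow 0
\quad\text{is exact.}
\]
% F and G are exact functors, so F_k = G_k = 0 for k >= 1, and the
% long exact sequence of the proposition reduces to
\[
0 \longleftarrow A/nA \longleftarrow A \xleftarrow{\ n\ } A
  \longleftarrow H_1(A) \longleftarrow 0 ,
\]
% which gives
\[
H_1(A) \cong \{a \in A : na = 0\}, \qquad H_k(A) = 0 \ \text{ for } k \geq 2 .
\]
```

Thus the first derived functor of $A \mapsto A/nA$ picks out the $n$-torsion of $A$, obtained here by varying the functor while keeping the module $A$ fixed.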


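PROOF. If $P = (P^+ \to A)$ is a projective resolution of $A$, then the natural
transformations $S$ and $T$ give us a planar diagram

\[
\begin{array}{ccccccccc}
 & & \vdots & & \vdots & & \vdots & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \longrightarrow & F(P_1) & \xrightarrow{\ S_{P_1}\ } & G(P_1) & \xrightarrow{\ T_{P_1}\ } & H(P_1) & \longrightarrow & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \longrightarrow & F(P_0) & \xrightarrow{\ S_{P_0}\ } & G(P_0) & \xrightarrow{\ T_{P_0}\ } & H(P_0) & \longrightarrow & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & 0 & & 0 & & 0 & &
\end{array}
\]

in which the columns are complexes, the rows are exact because the sequence
$F \xrightarrow{S} G \xrightarrow{T} H$ is exact on projectives, and the squares commute because $S$
and $T$ are natural transformations. The construction of the long exact sequence
then follows from Theorem 4.7.

For the functoriality, suppose that $g : A \to A'$ is a map between two modules
of $\mathcal{C}$. Let $P = (P^+ \to A)$ and $P' = (P'^+ \to A')$ be projective resolutions
of $A$ and $A'$, and use Theorem 4.12 to extend $g$ to a chain map $\{g_n\}$ of $P$ to
$P'$. Then the planar diagrams as above for $P$ and $P'$ can be embedded in a
3-dimensional diagram in such a way that the various maps $F(g_n)$, $G(g_n)$, and
$H(g_n)$ connecting the diagram for $P$ to the diagram for $P'$ make all squares
commute. The functoriality now follows immediately from Theorem 4.10.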


7. Ext and Tor 


In this section we study the derived functors of Hom and tensor product. Although
we shall treat each as carrying unital left $R$ modules, where $R$ is a ring with identity,
to abelian groups, the theory applies also to more complicated versions of Hom
and tensor product, such as when one of the $R$ modules in question is actually
a bimodule for the rings $R$ and $S$ and the result of Hom or tensor product is an
$S$ module. Problems 9-11 at the end of the chapter address the situation with
bimodules.

We know that $\operatorname{Hom}_R(A, B)$ is a contravariant left exact functor of the $A$ variable
and a left exact covariant functor of the $B$ variable. Thus we have two initial
choices for inserting resolutions and creating derived functors, namely

\[
\operatorname{Ext}^n_R(A, B) = H^n(\operatorname{Hom}_R(P, B)), \quad\text{with } P = (A \leftarrow P^+) \text{ projective,}
\]
and
\[
\operatorname{ext}^n_R(A, B) = H^n(\operatorname{Hom}_R(A, I)), \quad\text{with } I = (B \to I^+) \text{ injective.}
\]

Existence of the first one depends on having enough projectives in the category
of the $A$ variable, and existence of the second one depends on having enough
injectives in the category of the $B$ variable. Each of these, just as with Hom,
depends on two variables, one in contravariant fashion and the other in covariant
fashion. Thus Ext and ext are not functors of two variables in the strict sense of
our definitions. Instead, they are examples of "bifunctors," of which $\operatorname{Hom}_R(\,\cdot\,,\,\cdot\,)$
is the prototype, and the main result, Theorem 4.31 below, in essence says that
Ext and ext are naturally isomorphic as bifunctors, provided the first domain
category has enough projectives and the second has enough injectives. Among
other things this natural isomorphism will justify and explain how we were able
to define cohomology of groups in more than one way.¹³
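The two definitions can be compared directly in a small case. The sketch below computes both for $R = \mathbb{Z}$, $A = \mathbb{Z}/m$ with $m \geq 1$, and $B = \mathbb{Z}$; the example and the two resolutions are choices made here, and the sketch assumes the standard fact that $\mathbb{Q}$ and $\mathbb{Q}/\mathbb{Z}$ are injective abelian groups.

```latex
% Illustrative sketch (assumed example): A = Z/m, B = Z over R = Z.
%
% Via a projective resolution of A:   0 <- Z/m <- Z <--m-- Z <- 0.
% Applying Hom_Z( . , Z) and dropping the term for Z/m gives
\[
0 \longrightarrow \mathbb{Z} \xrightarrow{\ m\ } \mathbb{Z} \longrightarrow 0 ,
\qquad
\operatorname{Ext}^0_{\mathbb{Z}}(\mathbb{Z}/m, \mathbb{Z}) = \ker(m) = 0, \quad
\operatorname{Ext}^1_{\mathbb{Z}}(\mathbb{Z}/m, \mathbb{Z}) = \mathbb{Z}/m\mathbb{Z} .
\]
% Via an injective resolution of B:   0 -> Z -> Q -> Q/Z -> 0.
% Applying Hom_Z(Z/m, . ) and dropping the term for Z gives
\[
0 \longrightarrow \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/m, \mathbb{Q}) = 0
  \longrightarrow \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/m, \mathbb{Q}/\mathbb{Z}) \cong \mathbb{Z}/m
  \longrightarrow 0 ,
\qquad
\operatorname{ext}^0_{\mathbb{Z}}(\mathbb{Z}/m, \mathbb{Z}) = 0, \quad
\operatorname{ext}^1_{\mathbb{Z}}(\mathbb{Z}/m, \mathbb{Z}) \cong \mathbb{Z}/m .
\]
```

Both computations give $0$ in degree 0, in agreement with $\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/m, \mathbb{Z}) = 0$, and $\mathbb{Z}/m$ in degree 1.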

In the case of tensor product $A \otimes_R B$, similar remarks apply. Here $A$ is a
unital right $R$ module, and $B$ is a unital left $R$ module. The module $A$ in a natural
way is a unital left $R^o$ module, where $R^o$ is the opposite ring of $R$, and thus
tensor product is to be regarded as defined on the product of two categories of
left modules just as Hom is. We can regard tensor product as an actual functor in
either variable, and the functor is covariant right exact in both cases. Again we
have two initial choices for inserting resolutions and creating derived functors,
namely

\[
\operatorname{Tor}^R_n(A, B) = H_n(P \otimes_R B), \quad\text{with } P = (A \leftarrow P^+) \text{ projective,}
\]
and
\[
\operatorname{tor}^R_n(A, B) = H_n(A \otimes_R P'), \quad\text{with } P' = (B \leftarrow P'^+) \text{ projective.}
\]

These exist if the domain categories have enough projectives. Both Tor and tor
can be considered as covariant functors of two variables, or else as "bifunctors,"
and one can show in the same way as for Ext and ext that Tor and tor are naturally
isomorphic. There is no need to write out the details. It is customary to write Tor
for the common value.
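For a quick check that the two recipes agree in a familiar case, the following sketch resolves each variable in turn for $R = \mathbb{Z}$, $A = \mathbb{Z}/m$, and $B = \mathbb{Z}/n$ with $m, n \geq 1$. The example is chosen here for illustration, and $d$ denotes $\gcd(m, n)$.

```latex
% Illustrative sketch (assumed example): A = Z/m, B = Z/n, d = gcd(m, n).
%
% Resolving the first variable:   P = (Z/m <- Z <--m-- Z <- 0).
\[
P \otimes_{\mathbb{Z}} \mathbb{Z}/n : \quad
\mathbb{Z}/n \xleftarrow{\ m\ } \mathbb{Z}/n \longleftarrow 0 ,
\qquad
\operatorname{Tor}_0 \cong (\mathbb{Z}/n)\big/ m(\mathbb{Z}/n) \cong \mathbb{Z}/d, \quad
\operatorname{Tor}_1 \cong \ker\bigl(m : \mathbb{Z}/n \to \mathbb{Z}/n\bigr) \cong \mathbb{Z}/d .
\]
% Resolving the second variable:  P' = (Z/n <- Z <--n-- Z <- 0).
\[
\mathbb{Z}/m \otimes_{\mathbb{Z}} P' : \quad
\mathbb{Z}/m \xleftarrow{\ n\ } \mathbb{Z}/m \longleftarrow 0 ,
\qquad
\operatorname{tor}_0 \cong (\mathbb{Z}/m)\big/ n(\mathbb{Z}/m) \cong \mathbb{Z}/d, \quad
\operatorname{tor}_1 \cong \ker\bigl(n : \mathbb{Z}/m \to \mathbb{Z}/m\bigr) \cong \mathbb{Z}/d .
\]
```

In both computations the groups in degree $\geq 2$ are 0 because the resolutions have length 1.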


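¹³ It would add only definitions to our discussion to say precisely what a general bifunctor is and
what a general natural transformation between bifunctors is, and we shall skip that detail, in effect
incorporating the definitions into the theorem.

Proposition 4.29. Let $\mathcal{C}$ and $\mathcal{C}'$ be good categories of unital left $R$ modules,
and suppose that $\mathcal{C}$ has enough projectives. Then the contravariant left exact
functors $\operatorname{Hom}_R(\,\cdot\,, B)$ from $\mathcal{C}$ to $\mathcal{C}_{\mathbb{Z}}$ and their derived functors $\operatorname{Ext}^n_R(\,\cdot\,, B)$ have
the following properties:

(a) Whenever $0 \to A' \to A \to A'' \to 0$ is a short exact sequence in $\mathcal{C}$, then
there is a corresponding long exact sequence

\[
\begin{aligned}
0 &\longrightarrow \operatorname{Hom}_R(A'', B) \longrightarrow \operatorname{Hom}_R(A, B) \longrightarrow \operatorname{Hom}_R(A', B) \\
  &\longrightarrow \operatorname{Ext}^1_R(A'', B) \longrightarrow \operatorname{Ext}^1_R(A, B) \longrightarrow \operatorname{Ext}^1_R(A', B) \\
  &\longrightarrow \operatorname{Ext}^2_R(A'', B) \longrightarrow \operatorname{Ext}^2_R(A, B) \longrightarrow \operatorname{Ext}^2_R(A', B) \longrightarrow \operatorname{Ext}^3_R(A'', B) \longrightarrow \cdots
\end{aligned}
\]

in $\mathcal{C}_{\mathbb{Z}}$ for each module $B$ in $\mathcal{C}'$. The passage from short exact sequences in $\mathcal{C}$ to long
exact sequences of derived functor modules in $\mathcal{C}_{\mathbb{Z}}$ is functorial in its dependence
on the exact sequence in the first variable in the sense of Theorem 4.25 and is
natural in the second variable in the sense that if a map $\eta : B \to \widetilde{B}$ is given, then
$\operatorname{Hom}(1, \eta)$ defines a chain map from the long exact sequence for $B$ to the long
exact sequence for $\widetilde{B}$.

(b) If $P$ is a projective in $\mathcal{C}$ and $I$ is an injective in $\mathcal{C}'$, then $\operatorname{Ext}^n_R(P, B) = 0 =
\operatorname{Ext}^n_R(A, I)$ for all $n \geq 1$ and all modules $A$ in $\mathcal{C}$ and $B$ in $\mathcal{C}'$.

(c) Whenever $0 \to B' \to B \to B'' \to 0$ is a short exact sequence in $\mathcal{C}'$, then
there is a corresponding long exact sequence

\[
\begin{aligned}
0 &\longrightarrow \operatorname{Hom}_R(A, B') \longrightarrow \operatorname{Hom}_R(A, B) \longrightarrow \operatorname{Hom}_R(A, B'') \\
  &\longrightarrow \operatorname{Ext}^1_R(A, B') \longrightarrow \operatorname{Ext}^1_R(A, B) \longrightarrow \operatorname{Ext}^1_R(A, B'') \\
  &\longrightarrow \operatorname{Ext}^2_R(A, B') \longrightarrow \operatorname{Ext}^2_R(A, B) \longrightarrow \operatorname{Ext}^2_R(A, B'') \longrightarrow \operatorname{Ext}^3_R(A, B') \longrightarrow \cdots
\end{aligned}
\]

in $\mathcal{C}_{\mathbb{Z}}$ for each module $A$ in $\mathcal{C}$. The passage from short exact sequences in $\mathcal{C}'$ to
long exact sequences of derived functor modules in $\mathcal{C}_{\mathbb{Z}}$ is functorial in the exact
sequence in the second variable and is natural in the first variable in the sense that
if a map $\eta : A \to \widetilde{A}$ is given, then $\operatorname{Hom}(\eta, 1)$ defines a chain map from the long
exact sequence for $\widetilde{A}$ to the long exact sequence for $A$.


REMARKS. The naturality in the $B$ parameter of the construction of the long
exact sequence in (a) implies that $\operatorname{Ext}^n_R$ is a covariant functor of the second variable
for fixed argument of the first variable. It implies also that all maps $\operatorname{Ext}^n_R(\alpha, 1)$
commute with all maps $\operatorname{Ext}^n_R(1, \beta)$.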
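Parts (a) and (b) already suffice for some explicit computations. The sketch below, an illustration chosen here rather than taken from the text, applies them over $R = \mathbb{Z}$ to the short exact sequence $0 \to \mathbb{Z} \xrightarrow{m} \mathbb{Z} \to \mathbb{Z}/m \to 0$ with $m \geq 1$ and an arbitrary abelian group $B$.

```latex
% Illustrative sketch (assumed example): A' = A = Z, A'' = Z/m.
% By (b), Ext^n_Z(Z, B) = 0 for n >= 1 because Z is projective, and
% Hom_Z(Z, B) \cong B with the map induced by m being multiplication by m.
% The long exact sequence in (a) therefore reduces to
\[
0 \longrightarrow \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/m, B)
  \longrightarrow B \xrightarrow{\ m\ } B
  \longrightarrow \operatorname{Ext}^1_{\mathbb{Z}}(\mathbb{Z}/m, B)
  \longrightarrow 0 ,
\]
% so that
\[
\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/m, B) \cong \{b \in B : mb = 0\}, \qquad
\operatorname{Ext}^1_{\mathbb{Z}}(\mathbb{Z}/m, B) \cong B/mB, \qquad
\operatorname{Ext}^n_{\mathbb{Z}}(\mathbb{Z}/m, B) = 0 \ \text{ for } n \geq 2 .
\]
```

For $n \geq 2$ the group $\operatorname{Ext}^n_{\mathbb{Z}}(\mathbb{Z}/m, B)$ lies between $\operatorname{Ext}^{n-1}_{\mathbb{Z}}(\mathbb{Z}, B)$ and $\operatorname{Ext}^n_{\mathbb{Z}}(\mathbb{Z}, B)$ in the long exact sequence, and both of these vanish.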


PROOF. For (a), Theorem 4.22b gives the exact sequence, and Theorem 4.25
proves the functoriality in the first variable. For the naturality in the second
variable, let $\eta : B \to \widetilde{B}$ be given. The proof of Theorem 4.22 produces a
short exact sequence of projective resolutions of $A'$, $A$, $A''$ to which the functor
in that theorem is then applied. We now have two such functors $\operatorname{Hom}_R(\,\cdot\,, B)$
and $\operatorname{Hom}_R(\,\cdot\,, \widetilde{B})$, and the maps within each image diagram are all of the form
$\operatorname{Hom}(\alpha, 1)$. The two diagrams fit into a 3-dimensional diagram, and the maps
between the two diagrams are of the form $\operatorname{Hom}(1, \eta)$. Since all maps $\operatorname{Hom}(\alpha, 1)$
commute with all maps $\operatorname{Hom}(1, \beta)$, the 3-dimensional diagram is commutative.
The corresponding long exact sequences are then related by a cochain map
according to Theorem 4.10.

For (b), $0 \leftarrow P \leftarrow P \leftarrow 0$ is a projective resolution of $P$, and hence any
derived functor that is defined by projective resolutions is 0 in degree $\geq 1$. In
addition, Proposition 4.19b shows that $\operatorname{Hom}_R(\,\cdot\,, I)$ is an exact functor, and hence
its derived functors are 0 in degree $\geq 1$.

For (c), we shall apply Proposition 4.28 in its version for contravariant left exact
functors. Let $\varphi : B' \to B$ and $\psi : B \to B''$ be the maps in the given short exact
sequence, and let $F$, $G$, $H$ be the functors with $F(A) = \operatorname{Hom}_R(A, B')$, $G(A) =
\operatorname{Hom}_R(A, B)$, $H(A) = \operatorname{Hom}_R(A, B'')$. Then we have a natural transformation $S$
of $F$ into $G$ given by $S_A = \operatorname{Hom}(1, \varphi)$ and a natural transformation $T$ of $G$ into
$H$ given by $T_A = \operatorname{Hom}(1, \psi)$. Since

\[
0 \longrightarrow \operatorname{Hom}_R(P, B') \xrightarrow{\ \operatorname{Hom}(1,\varphi)\ } \operatorname{Hom}_R(P, B) \xrightarrow{\ \operatorname{Hom}(1,\psi)\ } \operatorname{Hom}_R(P, B'') \longrightarrow 0
\]

is exact by Proposition 4.19a, the sequence

\[
F \xrightarrow{\ S\ } G \xrightarrow{\ T\ } H
\]

is exact on projectives. Proposition 4.28 in its version for contravariant left exact
functors then says that there is a long exact sequence

\[
\begin{aligned}
0 &\longrightarrow F(A) \longrightarrow G(A) \longrightarrow H(A) \longrightarrow F^1(A) \longrightarrow G^1(A) \longrightarrow H^1(A) \\
  &\longrightarrow F^2(A) \longrightarrow G^2(A) \longrightarrow H^2(A) \longrightarrow F^3(A) \longrightarrow \cdots
\end{aligned}
\]

and that the passage to this long exact sequence is functorial in $A$. This much
establishes the long exact sequence in (c) and the naturality in the $A$ variable. For the
behavior in the second variable with $A$ fixed, suppose that we have a second exact
sequence $0 \to \widetilde{B}' \to \widetilde{B} \to \widetilde{B}'' \to 0$ that maps to the given one by a chain map $f$.
Let $\widetilde{F}$, $\widetilde{G}$, $\widetilde{H}$ be the functors $\operatorname{Hom}_R(\,\cdot\,, \widetilde{B}')$, $\operatorname{Hom}_R(\,\cdot\,, \widetilde{B})$, $\operatorname{Hom}_R(\,\cdot\,, \widetilde{B}'')$. We
then get two horizontal planar diagrams of the kind in the proof of Proposition
4.28, one for $\widetilde{F}$, $\widetilde{G}$, $\widetilde{H}$ and one for $F$, $G$, $H$. The maps within each of the
two diagrams are maps in the $A$ variable. The two diagrams embed in a
3-dimensional diagram with vertical maps $\operatorname{Hom}_R(1, f)$, and the 3-dimensional
diagram is commutative because all maps $\operatorname{Hom}(\alpha, 1)$ commute with all maps
$\operatorname{Hom}(1, \beta)$. Application of Theorem 4.10 then completes the proof of
functoriality in the exact sequence in the second variable.


Proposition 4.30. Let C and C’ be good categories of unital left R modules, 
and suppose that C’ has enough injectives. Then the covariant left exact func- 
tors Home(A, -) from C’ to Cz and their derived functors ext,(A, -) have the 
following properties: 


(a) Whenever 0 — A’ — A -— A” - O isa short exact sequence in C, then 
there is a corresponding long exact sequence 


0 —> Hom,(A”, B) —> Homg(A, B) —> Homa(JA’, B) 

—> extp(A”, B) —> extp(A, B) —> exth(A’, B) 

—> ext,(A”, B) —> ext,(A, B) —> ext?,(A’, B) > ext},(A", B) > --- 
in Cz foreach module B inC’. The passage from short exact sequences in C to long 
exact sequences of derived functor modules in Cz is functorial in its dependence 
on the exact sequence in the first variable and is natural in the second variable in 


the sense that if a map 7 : B — B is given, then Hom(1, 7) defines a chain map 
from the long exact sequence for B to the long exact sequence for B. 


7. Ext and Tor 227 


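(b) If P is a projective in $\mathcal{C}$ and I is an injective in $\mathcal{C}'$, then
$\operatorname{ext}^n_R(P, B) = 0 = \operatorname{ext}^n_R(A, I)$ for all $n \ge 1$ and all modules A in $\mathcal{C}$ and B in $\mathcal{C}'$.


(c) Whenever $0 \to B' \to B \to B'' \to 0$ is a short exact sequence in $\mathcal{C}'$, then
there is a corresponding long exact sequence

$$
\begin{aligned}
0 &\longrightarrow \operatorname{Hom}_R(A, B') \longrightarrow \operatorname{Hom}_R(A, B) \longrightarrow \operatorname{Hom}_R(A, B'') \\
  &\longrightarrow \operatorname{ext}^1_R(A, B') \longrightarrow \operatorname{ext}^1_R(A, B) \longrightarrow \operatorname{ext}^1_R(A, B'') \\
  &\longrightarrow \operatorname{ext}^2_R(A, B') \longrightarrow \operatorname{ext}^2_R(A, B) \longrightarrow \operatorname{ext}^2_R(A, B'') \longrightarrow \operatorname{ext}^3_R(A, B') \longrightarrow \cdots
\end{aligned}
$$

in $\mathcal{C}_{\mathbb{Z}}$ for each module A in $\mathcal{C}$. The passage from short exact sequences in $\mathcal{C}'$ to
long exact sequences of derived functor modules in $\mathcal{C}_{\mathbb{Z}}$ is functorial in the exact
sequence in the second variable and is natural in the first variable in the sense that
if a map $\eta : A \to \widetilde{A}$ is given, then $\operatorname{Hom}(\eta, 1)$ defines a chain map from the long
exact sequence for $\widetilde{A}$ to the long exact sequence for $A$.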


REMARKS. The naturality in the A parameter of the construction of the long
exact sequence in (c) implies that $\operatorname{ext}^n_R$ is a contravariant functor of the first variable
for fixed argument of the second variable. It implies also that all maps $\operatorname{ext}^n_R(\alpha, 1)$
commute with all maps $\operatorname{ext}^n_R(1, \beta)$.


PROOF. The proof of (c) is a simple variant of the proof of Proposition 4.29a, 
the proof of (b) is a simple variant of the proof of Proposition 4.29b, and the proof 
of (a) is a simple variant of the proof of Proposition 4.29c. 


Propositions 4.29 and 4.30 show that Ext and ext, as functors of the first variable 
and as functors of the second variable, generate the same long exact sequences, 
the first under the assumption that C has enough projectives and the second under 
the assumption that C’ has enough injectives. Theorem 4.31 will show that Ext 
and ext may be treated as equal if both assumptions are satisfied. It is customary 
therefore to use Ext as the notation in both cases; thus Ext exists if either C has 
enough projectives or C’ has enough injectives. In both cases, Ext has a long 
exact sequence in the first variable and another long exact sequence in the second 
variable. 
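
As a concrete check of this agreement, the following worked computation takes R = Z, A = Z/m with m ≥ 2, and B = Z in the category of all abelian groups. This choice is made here for illustration and is not taken from the text; it assumes the standard facts that Z is projective and that Q and Q/Z are injective as abelian groups. The first display computes the derived functor from a projective resolution of A, the second from an injective resolution of B.

```latex
% Projective resolution of A = Z/m:   0 -> Z --m--> Z -> Z/m -> 0.
% Apply Hom_Z( . , Z) to the resolution with Z/m deleted:
\[
0 \longrightarrow \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z},\mathbb{Z}) \cong \mathbb{Z}
  \xrightarrow{\;m\;} \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z},\mathbb{Z}) \cong \mathbb{Z}
  \longrightarrow 0,
\qquad
\operatorname{Ext}^1_{\mathbb{Z}}(\mathbb{Z}/m,\mathbb{Z}) \cong \mathbb{Z}/m\mathbb{Z}.
\]
% Injective resolution of B = Z:   0 -> Z -> Q -> Q/Z -> 0.
% Apply Hom_Z(Z/m, . ) to the resolution with Z deleted:
\[
0 \longrightarrow \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/m,\mathbb{Q}) = 0
  \longrightarrow \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/m,\mathbb{Q}/\mathbb{Z}) \cong \mathbb{Z}/m\mathbb{Z}
  \longrightarrow 0,
\qquad
\operatorname{ext}^1_{\mathbb{Z}}(\mathbb{Z}/m,\mathbb{Z}) \cong \mathbb{Z}/m\mathbb{Z}.
\]
```

In degree 0 both computations give Hom(Z/m, Z) = 0, so the two constructions agree in every degree for this pair of modules.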


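Theorem 4.31. Let $\mathcal{C}$ and $\mathcal{C}'$ be good categories of unital left R modules,
and suppose that $\mathcal{C}$ has enough projectives and $\mathcal{C}'$ has enough injectives. Then
$\operatorname{Ext}^n_R(\,\cdot\,, \,\cdot\,)$ and $\operatorname{ext}^n_R(\,\cdot\,, \,\cdot\,)$ are naturally isomorphic from $\mathcal{C} \times \mathcal{C}'$ to $\mathcal{C}_{\mathbb{Z}}$ in the
sense that for each $n \ge 0$ and each pair of modules $(A, B)$ in $\mathcal{C} \times \mathcal{C}'$, there exists
an isomorphism $T_{(n,A,B)}$ in $\operatorname{Hom}_{\mathbb{Z}}(\operatorname{Ext}^n_R(A, B), \operatorname{ext}^n_R(A, B))$ such that if $\varphi$ is in
$\operatorname{Hom}_R(A, A')$ and $\psi$ is in $\operatorname{Hom}_R(B, B')$, then the diagrams

$$
\begin{array}{ccc}
\operatorname{Ext}^n_R(A, B) & \xrightarrow{\;T_{(n,A,B)}\;} & \operatorname{ext}^n_R(A, B) \\
{\scriptstyle \operatorname{Ext}^n(\varphi,1)}\big\uparrow & & \big\uparrow{\scriptstyle \operatorname{ext}^n(\varphi,1)} \\
\operatorname{Ext}^n_R(A', B) & \xrightarrow{\;T_{(n,A',B)}\;} & \operatorname{ext}^n_R(A', B)
\end{array}
$$

and

$$
\begin{array}{ccc}
\operatorname{Ext}^n_R(A, B) & \xrightarrow{\;T_{(n,A,B)}\;} & \operatorname{ext}^n_R(A, B) \\
{\scriptstyle \operatorname{Ext}^n(1,\psi)}\big\downarrow & & \big\downarrow{\scriptstyle \operatorname{ext}^n(1,\psi)} \\
\operatorname{Ext}^n_R(A, B') & \xrightarrow{\;T_{(n,A,B')}\;} & \operatorname{ext}^n_R(A, B')
\end{array}
$$

commute.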


REMARKS. The reader will be able to observe that a certain part of this proof 
amounts to showing that 3-dimensional diagrams in the shape of a cube having 
5 faces equal to commuting squares and having suitable hypotheses on the maps 
automatically have their sixth face equal to a commuting square. The hypotheses 
concerning the faces and the maps come from Propositions 4.29 and 4.30, as well 
as induction. We shall not try to abstract a general result of this kind, however. 


PROOF. We induct on n for $n \ge 0$. Several steps are involved in the proof, and
we complete all of them for a particular n before going on to n + 1. The steps for
a particular n are


(i) to define $T_{(n,A,B)}$ in the presence of an injective $I$ and a one-one map
$\mu : B \to I$ and to observe that $T_{(n,A,B)}$ is an isomorphism,
(ii) to show that the same $T_{(n,A,B)}$ results independently of the choice of $I$,
(iii) to prove the commutativity of the second diagram in the statement of the
theorem, and
(iv) to prove the commutativity of the first diagram in the statement of the
theorem.


The first base case of the induction is n = 0, for which we take $T_{(0,A,B)}$ to be the
identity on $\operatorname{Hom}_R(A, B)$. Then (i) through (iv) are immediate.

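The other base case of the induction is n = 1. Let (A, B) be given. An
injective $I$ and a one-one map $\mu : B \to I$ exist as in (i) because $\mathcal{C}'$ has enough
injectives. Then we have an exact sequence

$$
0 \longrightarrow B \xrightarrow{\ \mu\ } I \xrightarrow{\ \nu\ } C \longrightarrow 0 \qquad (*)
$$

in which $C = I/\mu(B)$ and $\nu$ is the quotient map. We know from Propositions
4.29b and 4.30b that $\operatorname{Ext}^1_R(A, I) = 0 = \operatorname{ext}^1_R(A, I)$. Therefore Propositions
4.29c and 4.30c give us exact sequences

$$
\operatorname{Hom}_R(A, I) \xrightarrow{\ \operatorname{Hom}(1,\nu)\ } \operatorname{Hom}_R(A, C) \xrightarrow{\ \omega_{E,0}\ } \operatorname{Ext}^1_R(A, B) \longrightarrow 0
$$

and

$$
\operatorname{Hom}_R(A, I) \xrightarrow{\ \operatorname{Hom}(1,\nu)\ } \operatorname{Hom}_R(A, C) \xrightarrow{\ \omega_{e,0}\ } \operatorname{ext}^1_R(A, B) \longrightarrow 0
$$

in which $\omega_{E,0}$ and $\omega_{e,0}$ are suitable connecting homomorphisms. We define
$T_{(1,A,B)} = \omega_{e,0}(\omega_{E,0})^{-1}$. This definition is meaningful, since the exactness of the
two sequences gives

$$
(\omega_{E,0})^{-1}(0) = \ker \omega_{E,0} = \operatorname{Hom}(1, \nu)(\operatorname{Hom}_R(A, I)) = \ker \omega_{e,0};
$$

by an analogous computation, $\omega_{E,0}(\omega_{e,0})^{-1}$ is a well-defined function, and it is
evidently a two-sided inverse. Thus $T_{(1,A,B)}$ is an isomorphism. This completes
step (i).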

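In order to be able to handle steps (ii) and (iii) without being repetitive, let a
map $\psi : B \to B'$ be given. For (ii), $B'$ will be $B$, and $\psi$ will be the identity on
$B$. For (iii), $B'$ and $\psi$ will be general. Given $\psi$ and one-one maps $\mu : B \to I$
and $\mu' : B' \to I'$, we can form the exact rows and the first column of the diagram

$$
\begin{array}{ccccccccc}
0 & \longrightarrow & B & \xrightarrow{\ \mu\ } & I & \xrightarrow{\ \nu\ } & C & \longrightarrow & 0 \\
 & & \big\downarrow{\scriptstyle \psi} & & \big\downarrow{\scriptstyle f} & & \big\downarrow{\scriptstyle \bar f} & & \\
0 & \longrightarrow & B' & \xrightarrow{\ \mu'\ } & I' & \xrightarrow{\ \nu'\ } & C' & \longrightarrow & 0.
\end{array}
\qquad (**)
$$

If we think of $I$ and $I'$ as extended to injective resolutions, Theorem 4.16 allows
us to fill in a cochain map from the one extension to the other, and the first new
step of that cochain map is $f$. If we define $\bar f = \nu' f \nu^{-1}$, then $\bar f$ is well defined
because

$$
\nu' f \nu^{-1}(0) = \nu' f \ker \nu = \nu' f \operatorname{image} \mu
 = \nu' f \mu(B) = \nu' \mu' \psi(B) \subseteq \nu'(\mu'(B')) = 0,
$$

and the squares of the diagram $(**)$ now commute. Continuing with the effort
to cut down on repetitive arguments, let $k \ge 1$ be an integer that will be 1 when
n = 1 and will be different later in the proof. Applying Proposition 4.29c to $(**)$
gives us a commuting square

$$
\begin{array}{ccc}
\operatorname{Ext}^{k-1}(A, C) & \xrightarrow{\ \omega_{E,k-1}\ } & \operatorname{Ext}^{k}(A, B) \\
{\scriptstyle \operatorname{Ext}^{k-1}(1,\bar f)}\big\downarrow & & \big\downarrow{\scriptstyle \operatorname{Ext}^{k}(1,\psi)} \\
\operatorname{Ext}^{k-1}(A, C') & \xrightarrow{\ \omega'_{E,k-1}\ } & \operatorname{Ext}^{k}(A, B')
\end{array}
\qquad (\dagger)
$$

for $k \ge 1$, and Proposition 4.30c gives us a similar commuting square for ext for
$k \ge 1$.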

For each module in the diagram with Ext when k = 1, there is a map to the
corresponding module in the diagram with ext. These maps are $T_{(k-1,A,C)}$ for
the upper left and $T_{(k-1,A,C')}$ for the lower left. The maps for the upper right and
lower right depend on the step of the argument.

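For step (ii), we are taking $B' = B$, and the maps at the right are the two
versions of $T_{(k,A,B)}$, one for the injective $I$ and one for the injective $I'$. Let
us call them $T_{(k,A,B)}$ and $T'_{(k,A,B)}$. We are to prove that $T'_{(k,A,B)} \operatorname{Ext}^k(1, \psi) =
\operatorname{ext}^k(1, \psi) T_{(k,A,B)}$ for $\psi = 1$. The relevant definitions are

$$
T_{(k,A,B)} = \omega_{e,k-1}\, T_{(k-1,A,C)}\, (\omega_{E,k-1})^{-1}
\quad\text{and}\quad
T'_{(k,A,B)} = \omega'_{e,k-1}\, T_{(k-1,A,C')}\, (\omega'_{E,k-1})^{-1},
$$

or equivalently

$$
T_{(k,A,B)}\, \omega_{E,k-1} = \omega_{e,k-1}\, T_{(k-1,A,C)}
\quad\text{and}\quad
T'_{(k,A,B)}\, \omega'_{E,k-1} = \omega'_{e,k-1}\, T_{(k-1,A,C')}.
$$

Since $T_{(k-1,A,C)}$ and $T_{(k-1,A,C')}$ are known inductively to be well defined and to
satisfy (iii), we have $\operatorname{ext}^{k-1}(1, \bar f)\, T_{(k-1,A,C)} = T_{(k-1,A,C')} \operatorname{Ext}^{k-1}(1, \bar f)$. Thus

$$
\begin{aligned}
\operatorname{ext}^k(1, \psi)\, T_{(k,A,B)}\, \omega_{E,k-1}
 &= \operatorname{ext}^k(1, \psi)\, \omega_{e,k-1}\, T_{(k-1,A,C)} \\
 &= \omega'_{e,k-1} \operatorname{ext}^{k-1}(1, \bar f)\, T_{(k-1,A,C)}
  = \omega'_{e,k-1}\, T_{(k-1,A,C')} \operatorname{Ext}^{k-1}(1, \bar f) \\
 &= T'_{(k,A,B)}\, \omega'_{E,k-1} \operatorname{Ext}^{k-1}(1, \bar f)
  = T'_{(k,A,B)} \operatorname{Ext}^k(1, \psi)\, \omega_{E,k-1}.
\end{aligned}
$$

Since $\operatorname{Ext}^k(1, \psi) = 1$ and $\operatorname{ext}^k(1, \psi) = 1$ when $\psi = 1$, step (ii) follows for
n = 1, i.e., $T_{(1,A,B)}$ is well defined.

For step (iii), we are allowing general $B'$, and the maps at the right between
the two versions of $(*)$ are the well-defined isomorphisms $T_{(k,A,B)}$ and $T_{(k,A,B')}$.
We are to prove that $T_{(k,A,B')} \operatorname{Ext}^k(1, \psi) = \operatorname{ext}^k(1, \psi)\, T_{(k,A,B)}$. The argument in
the previous paragraph applies if we change $T'_{(k,A,B)}$ systematically to $T_{(k,A,B')}$
and take into account that $\omega_{E,k-1}$ is onto, and step (iii) follows for n = 1.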

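For step (iv), let $\varphi : A \to A'$ be given. The conclusion of Proposition 4.29c
that the dependence is natural in the first variable gives us a commuting square

$$
\begin{array}{ccc}
\operatorname{Ext}^{k-1}(A, C) & \xrightarrow{\ \omega_{E,k-1}\ } & \operatorname{Ext}^{k}(A, B) \\
{\scriptstyle \operatorname{Ext}^{k-1}(\varphi,1)}\big\uparrow & & \big\uparrow{\scriptstyle \operatorname{Ext}^{k}(\varphi,1)} \\
\operatorname{Ext}^{k-1}(A', C) & \xrightarrow{\ \omega'_{E,k-1}\ } & \operatorname{Ext}^{k}(A', B)
\end{array}
\qquad (\dagger\dagger)
$$

for $k \ge 1$ and for suitable connecting homomorphisms $\omega_{E,k-1}$ and $\omega'_{E,k-1}$, and
Proposition 4.30c gives a similar commuting square for ext for $k \ge 1$. For each
module in the diagram with Ext when k = 1, there is a map to the corresponding
module in the diagram with ext. These maps are $T_{(k-1,A,C)}$ for the upper left,
$T_{(k-1,A',C)}$ for the lower left, $T_{(k,A,B)}$ for the upper right, and $T_{(k,A',B)}$ for the lower
right. We are to prove that $T_{(k,A,B)} \operatorname{Ext}^k(\varphi, 1) = \operatorname{ext}^k(\varphi, 1)\, T_{(k,A',B)}$. The relevant
definitions are

$$
T_{(k,A,B)}\, \omega_{E,k-1} = \omega_{e,k-1}\, T_{(k-1,A,C)}
\quad\text{and}\quad
T_{(k,A',B)}\, \omega'_{E,k-1} = \omega'_{e,k-1}\, T_{(k-1,A',C)}.
$$

Since $T_{(k-1,A,C)}$ and $T_{(k-1,A',C)}$ are known inductively to satisfy (iv), we have
$\operatorname{ext}^{k-1}(\varphi, 1)\, T_{(k-1,A',C)} = T_{(k-1,A,C)} \operatorname{Ext}^{k-1}(\varphi, 1)$. Thus

$$
\begin{aligned}
\operatorname{ext}^k(\varphi, 1)\, T_{(k,A',B)}\, \omega'_{E,k-1}
 &= \operatorname{ext}^k(\varphi, 1)\, \omega'_{e,k-1}\, T_{(k-1,A',C)} \\
 &= \omega_{e,k-1} \operatorname{ext}^{k-1}(\varphi, 1)\, T_{(k-1,A',C)}
  = \omega_{e,k-1}\, T_{(k-1,A,C)} \operatorname{Ext}^{k-1}(\varphi, 1) \\
 &= T_{(k,A,B)}\, \omega_{E,k-1} \operatorname{Ext}^{k-1}(\varphi, 1)
  = T_{(k,A,B)} \operatorname{Ext}^k(\varphi, 1)\, \omega'_{E,k-1}.
\end{aligned}
$$

Since $\omega'_{E,k-1}$ is onto, step (iv) follows for n = 1. This completes the proof for
n = 1.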

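For the inductive step, suppose that steps (i) through (iv) have been carried out
for some $n \ge 1$. Let us carry out step (i) for stage n + 1. For a given B, we know
from Propositions 4.29b and 4.30b that $\operatorname{Ext}^k_R(A, I) = 0 = \operatorname{ext}^k_R(A, I)$ for
$k = n$ and $k = n + 1$. Hence Propositions 4.29c and 4.30c give us exact sequences

$$
0 \longrightarrow \operatorname{Ext}^n_R(A, C) \xrightarrow{\ \omega_{E,n}\ } \operatorname{Ext}^{n+1}_R(A, B) \longrightarrow 0
$$

and

$$
0 \longrightarrow \operatorname{ext}^n_R(A, C) \xrightarrow{\ \omega_{e,n}\ } \operatorname{ext}^{n+1}_R(A, B) \longrightarrow 0.
$$

In other words, $\omega_{E,n}$ and $\omega_{e,n}$ are isomorphisms. If we put

$$
T_{(n+1,A,B)} = \omega_{e,n}\, T_{(n,A,C)}\, \omega_{E,n}^{-1},
$$

then $T_{(n+1,A,B)}$ is an isomorphism of $\operatorname{Ext}^{n+1}_R(A, B)$ onto $\operatorname{ext}^{n+1}_R(A, B)$. This
completes step (i) for stage n + 1.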

We now refer back to our argument for n = 1 and put k = n + 1 throughout. 
Tracing matters through, we see that the argument carries out steps (ii) through 
(iv) for stage n + 1. This completes the induction and the proof. 




8. Abelian Categories 


Not all situations in which one wants to apply homological algebra are limited to 
good categories of unital left R modules for some ring R. We have mentioned 
sheaves as one example, and we shall develop some properties of sheaves in Chap- 
ter X. Implicitly we have carried along a second example: all chain complexes 
within a good category, with chain maps as morphisms, form a category in which 
short exact sequences have remarkable properties, such as those in Theorems 4.7 
and 4.10. 

A setting to which one can generalize well such basic parts of homological 
algebra is that of “abelian categories,” which we define in this section. It is
advisable not to require that the objects in an abelian category actually be sets 
of individual elements; otherwise there is little chance that the notion of abelian 
category could be self dual. The morphisms of the category are then effectively 
all we have to work with, since a morphism already determines its “domain” and 
“range.” If X and Y are objects, then a morphism in Morph(X, Y) need not be a 
function, but at least Morph(X, Y) is a set with elements to it. Since objects no 
longer have elements, books usually suppress the objects in the discussion to the 
point of referring to things like kernels and cokernels as morphisms rather than 
objects. It is perhaps more comfortable to think of a kernel as a pair, consisting of 
an object and a morphism into another object, rather than just as the embedding 
morphism, and we shall follow the more comfortable convention temporarily. 

We introduce the notion of “abelian category” in stages. We begin with some 
definitions and remarks that make sense in a general category. First of all, let 
us have names for X and Y when referring to morphisms in Morph(X, Y) that 
do not require us to think in terms of functions. The convention is that if u is 
in Morph(X, Y), then X is the domain of u and Y is the codomain. We allow 
ourselves to write compositions of morphisms as gf or as go f. 

Next, it is possible to generalize usefully the notions of “one-one” and “onto” to
make them applicable in any category. The definitions are in terms of cancellation
laws. In the category C, a morphism u ∈ Morph(X, Y) is a monomorphism¹⁴ if
for any f and g in the same set Morph(W, X) such that uf = ug, it follows that
f = g. Any isomorphism is certainly a monomorphism. The composition of two
monomorphisms is a monomorphism. In fact, if u and v are monomorphisms
with vuf = vug, then uf = ug because v is a monomorphism, and f = g
because u is a monomorphism. If m is a monomorphism in Morph(X, Y) and u
is any morphism in Morph(Y, X) such that $mu = 1_Y$, then m is an isomorphism.
In fact, $mu = 1_Y$ implies $mum = 1_Y m = m$, which implies $um = 1_X$, since m
is a monomorphism; therefore u is a two-sided inverse to m.


¹⁴Some authors use the word “monic” or the word “mono” as an adjectival form of this noun.




The morphism u ∈ Morph(X, Y) is an epimorphism¹⁵ if for any f' and g'
in the same set Morph(Y, Z) such that f'u = g'u, it follows that f' = g'. Any
isomorphism is an epimorphism. The composition of two epimorphisms is an
epimorphism. If e is an epimorphism in Morph(X, Y) and u is any morphism in
Morph(Y, X) such that $ue = 1_X$, then e is an isomorphism.

Finally a zero object 0 in a category C is an object such that for each X in
Obj(C), each of Morph(X, 0) and Morph(0, X) has exactly one member. It is
immediate that any two zero objects are isomorphic: if 0 and 0' are zero objects,
then Morph(0, 0) and Morph(0', 0') each have just one member, which must be $1_0$
and $1_{0'}$ in the two cases; the composition of the member of Morph(0, 0') followed
by the member of Morph(0', 0) must be $1_0$, and the composition in the other order
must be $1_{0'}$, and the isomorphism of 0 with 0' has been exhibited.

Suppose that a zero object exists. Since the composition law for morphisms
in C insists that the composite of a member of Morph(X, 0) and a member
of Morph(0, Y) be in Morph(X, Y), it follows that Morph(X, Y) has a
distinguished member, which we denote by $0_{XY}$. This is called the zero morphism of
Morph(X, Y). By associativity it satisfies $f\,0_{XY} = 0_{XZ}$ for all f ∈ Morph(Y, Z)
and $0_{XY}\,g = 0_{WY}$ for all g ∈ Morph(W, X). Since Morph(0, 0) has just one
element, we have $0_{00} = 1_0$. If X is any other object such that Morph(X, X) has
$0_{XX} = 1_X$, then X is a zero object; in fact, the equalities $0_{X0}0_{0X} = 0_{00} = 1_0$ and
$0_{0X}0_{X0} = 0_{XX} = 1_X$ show that X and 0 are isomorphic.

An additive category C is a category with the following three properties: 


(i) C has a zero object,
(ii) the product and the coproduct¹⁶ of any two objects in C exist in C,
(iii) each set Morph(X, Y) is an abelian group with the property that the
operation is Z bilinear in the sense that if the operation is + and if f, f'
are arbitrary in Morph(X, Y) and g, g' are arbitrary in Morph(Y, Z), then

$$
(g + g') \circ (f + f') = g \circ f + g' \circ f + g \circ f' + g' \circ f'
$$

and

$$
g \circ (-f) = (-g) \circ f = -(g \circ f).
$$


If C is an additive category, then so is the opposite category $C^{\mathrm{opp}}$; this fact
will enable us to use duality arguments occasionally. We shall henceforth write 
Hom(X, Y) in place of Morph(X, Y) for additive categories. 

The zero morphism $0_{XY}$ of Hom(X, Y) is the additive identity 0 of the abelian
group Hom(X, Y). In fact, $0_{0Y}$ is the additive identity of Hom(0, Y), since
Hom(0, Y) has just one element. Therefore $0_{XY} = 0_{0Y}0_{X0} = (0_{0Y} + 0_{0Y})0_{X0} =
0_{0Y}0_{X0} + 0_{0Y}0_{X0} = 0_{XY} + 0_{XY}$, and we obtain $0 = 0_{XY}$.


¹⁵Some authors use the word “epi” as an adjectival form of this noun.
¹⁶These are defined in Section IV.11 of Basic Algebra. They are always unique up to canonical
isomorphism when they exist.




In an additive category a morphism u in Hom(X, Y) is a monomorphism if
whenever uf = 0 with f in some Hom(W, X), then f = 0; a morphism u in
Hom(X, Y) is an epimorphism if whenever f'u = 0 with f' in some Hom(Y, Z),
then f' = 0.
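
The reduction from the cancellation law to this criterion uses only the bilinearity of composition. A one-line derivation, written for the monomorphism case with f, g, and h in Hom(W, X) (the name h is introduced here for illustration), is the following.

```latex
\[
uf = ug \iff uf - ug = 0 \iff u(f - g) = 0,
\qquad\text{so}\qquad
\bigl(uh = 0 \Rightarrow h = 0\bigr)\ \Longrightarrow\ \bigl(uf = ug \Rightarrow f = g\bigr).
\]
```

Conversely, taking g = 0 in the cancellation law gives uf = 0 = u0, hence f = 0. The epimorphism case follows by reversing the order of composition.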

This much structure forces products and coproducts to amount to the same 
thing in an additive category. The precise result is as follows. 


Proposition 4.32. In an additive category, let $(C, p_A, p_B)$ be a product of two
objects A and B. Then there exist unique $i_A \in \operatorname{Hom}(A, C)$ and $i_B \in \operatorname{Hom}(B, C)$
such that

$$
p_A i_A = 1_A, \qquad p_B i_B = 1_B, \qquad i_A p_A + i_B p_B = 1_C.
$$

These satisfy $p_A i_B = 0$ and $p_B i_A = 0$, and $(C, i_A, i_B)$ is a coproduct of A and B.


REMARKS. 

(1) Since the defining properties of an additive category are self dual, any
coproduct has a similar structure and becomes a product. The proof in effect will
show more: whenever there are data $A, B, C, i_A, i_B, p_A, p_B$ satisfying the
displayed identities, then $(C, p_A, p_B)$ is a product of A and B, and $(C, i_A, i_B)$ is
a coproduct. Thus a product/coproduct can be recognized without reference to
other objects in the category.

(2) To emphasize the analogy with modules or vector spaces, we write $A \oplus B$
for a product or coproduct of A and B in C and call it the direct sum of A and
B. The notation is understood to carry the morphisms $i_A, i_B, p_A, p_B$ along with
it. The direct sum is unique up to an isomorphism that carries the one set of
morphisms $i_A, i_B, p_A, p_B$ to the other.


PROOF. To the pair $1_A \in \operatorname{Hom}(A, A)$ and $0 \in \operatorname{Hom}(A, B)$, the product C
associates a unique $i_A \in \operatorname{Hom}(A, C)$ with $p_A i_A = 1_A$ and $p_B i_A = 0$. Similarly
the product associates a unique $i_B \in \operatorname{Hom}(B, C)$ with $p_A i_B = 0$ and $p_B i_B = 1_B$.
Computing with the aid of the Z bilinearity and associativity, we have

$$
p_A(i_A p_A + i_B p_B) = 1_A p_A + 0\, p_B = p_A
\quad\text{and}\quad
p_B(i_A p_A + i_B p_B) = 0\, p_A + 1_B p_B = p_B.
$$

Therefore $h = i_A p_A + i_B p_B$ is a member of Hom(C, C) with the property that
$p_A h = p_A$ and $p_B h = p_B$. Since $1_C$ is another member of Hom(C, C) with this
property, the assumed uniqueness shows that $h = 1_C$. This proves the displayed
formulas in the proposition and the formulas $p_A i_B = 0$ and $p_B i_A = 0$.

For uniqueness of $i_A$ and $i_B$, suppose that $i'_A$ and $i'_B$ satisfy $i'_A p_A + i'_B p_B = 1_C$.
Right multiplication by $i_A$ gives $i_A = 1_C i_A = (i'_A p_A + i'_B p_B) i_A = i'_A 1_A + i'_B 0 =
i'_A$, and similarly $i_B = i'_B$.




To see that $(C, i_A, i_B)$ is a coproduct of A and B, let $f \in \operatorname{Hom}(A, X)$ and
$g \in \operatorname{Hom}(B, X)$ be given, and define $h = f p_A + g p_B$. This is in Hom(C, X), has
$h i_A = f p_A i_A + g p_B i_A = f 1_A = f$, and similarly has $h i_B = g$. For uniqueness
suppose that k is in Hom(C, X) with $k i_A = f$ and $k i_B = g$. Then $k i_A p_A = f p_A$
and $k i_B p_B = g p_B$. Addition gives

$$
k = k 1_C = k(i_A p_A + i_B p_B) = f p_A + g p_B = h,
$$

and uniqueness is proved.
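
For orientation, here is how the identities of Proposition 4.32 look in one familiar additive category. The choice of finite-dimensional vector spaces over a field k, with A = k^a, B = k^b, C = k^{a+b}, and morphisms written as block matrices, is an assumption made here for illustration only; 1_a denotes the a-by-a identity matrix.

```latex
\[
i_A = \begin{pmatrix} 1_a \\ 0 \end{pmatrix},\quad
i_B = \begin{pmatrix} 0 \\ 1_b \end{pmatrix},\quad
p_A = \begin{pmatrix} 1_a & 0 \end{pmatrix},\quad
p_B = \begin{pmatrix} 0 & 1_b \end{pmatrix},
\]
\[
p_A i_A = 1_a,\quad p_B i_B = 1_b,\quad p_A i_B = 0,\quad p_B i_A = 0,\quad
i_A p_A + i_B p_B
 = \begin{pmatrix} 1_a & 0 \\ 0 & 0 \end{pmatrix}
 + \begin{pmatrix} 0 & 0 \\ 0 & 1_b \end{pmatrix}
 = 1_{a+b}.
\]
```

In this setting the morphism h = f p_A + g p_B of the proof is the block row matrix (f g), which restricts to f on the first a coordinates and to g on the last b.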


For an additive category C, the notions of the kernel and cokernel of a morphism 
are defined by universal mapping properties. Problems 18-22 at the end of 
Chapter VI of Basic Algebra discussed universal mapping properties abstractly, 
saying what they are in a general context. For current purposes it is enough to 
know that what a universal mapping property produces (if it produces anything 
at all) is a pair consisting of an object and a morphism, and moreover the pair is 
automatically unique (if it exists) up to canonical isomorphism. 

We allow ourselves to write morphisms as arrows in any of the customary ways
for functions. Thus a member u of Hom(A, B) may be written as $A \xrightarrow{\,u\,} B$, and
a composition of u followed by a morphism v ∈ Hom(B, C), which has been
written as v ∘ u or as vu, may be written as $A \xrightarrow{\,u\,} B \xrightarrow{\,v\,} C$.

If $A \xrightarrow{\,u\,} B$ is a morphism in the additive category C, then the kernel of u,
denoted by ker u, is a pair (K, i) with i ∈ Hom(K, A) such that the composition
$K \xrightarrow{\,i\,} A \xrightarrow{\,u\,} B$ has ui = 0 and such that for any pair (K', i') with i' in
Hom(K', A) for which ui' = 0, there exists a unique $\varphi \in \operatorname{Hom}(K', K)$ with
$i\varphi = i'$. See Figure 4.6. It is customary to drop all mention of K in the definition
of kernel, saying that the kernel is i, since any mention of i carries along K as the
domain of i; we shall adopt this abbreviated terminology shortly but shall refer
to the pair (K, i) as the kernel for the time being.


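[Figure 4.6: the row $K \xrightarrow{\,i\,} A \xrightarrow{\,u\,} B$, a morphism $i' : K' \to A$ with $ui' = 0$, and the induced morphism $\varphi : K' \to K$ with $i\varphi = i'$.]


FIGURE 4.6. Universal mapping property of a kernel (K, i) of u.

The brief form of the definition of kernel is that $u \circ (\ker u) = 0$ and

$$
ui' = 0 \quad\text{implies}\quad i' = (\ker u) \circ \varphi \ \text{ uniquely}.
$$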


The kernel of u is determined only up to an isomorphism applied to K; that is, i
is determined only up to right multiplication by an isomorphism. The condition
for (K, i) to be a kernel is equivalent to the exactness, for every object K', of the
sequence of abelian groups

$$
0 \longrightarrow \operatorname{Hom}(K', K) \xrightarrow{\ \operatorname{Hom}(1,i)\ } \operatorname{Hom}(K', A) \xrightarrow{\ \operatorname{Hom}(1,u)\ } \operatorname{Hom}(K', B).
$$

In fact, ui = 0 makes the sequence a complex, the existence of $\varphi$ produces
exactness at Hom(K', A), and the uniqueness of $\varphi$ produces exactness at Hom(K', K).
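
A small example in the category of abelian groups, chosen here for illustration, shows both the universal property and the non-uniqueness. Take u : Z → Z/n to be the quotient map, with n ≥ 1.

```latex
% Two kernels of the quotient map u : Z -> Z/n; they differ by the automorphism -1 of K = Z.
\[
(K, i) = (\mathbb{Z},\ x \mapsto nx), \qquad (K, i \circ (-1)) = (\mathbb{Z},\ x \mapsto -nx).
\]
% Universal property: if i' : K' -> Z has u i' = 0, then i'(K') lies in nZ, and
\[
\varphi(y) = i'(y)/n \quad (y \in K') \qquad \text{is the unique homomorphism with } i\varphi = i'.
\]
```

Uniqueness of the factorization holds because multiplication by n is one-one on Z, in line with the fact that a kernel is always a monomorphism.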

Similarly the cokernel of u, denoted by coker u, is a pair (C, p) with p in
Hom(B, C) such that the composition $A \xrightarrow{\,u\,} B \xrightarrow{\,p\,} C$ has pu = 0 and such
that for any pair (C', p') with p' in Hom(B, C') for which p'u = 0, there exists
a unique $\psi \in \operatorname{Hom}(C, C')$ with $\psi p = p'$. See Figure 4.7. It is customary to
drop all mention of the object C in the definition of cokernel, saying that the
cokernel is p, since any mention of p carries along C as the codomain of p; we
shall adopt this abbreviated terminology shortly but shall refer to the pair (C, p)
as the cokernel for the time being.


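[Figure 4.7: the row $A \xrightarrow{\,u\,} B \xrightarrow{\,p\,} C$, a morphism $p' : B \to C'$ with $p'u = 0$, and the induced morphism $\psi : C \to C'$ with $\psi p = p'$.]


FIGURE 4.7. Universal mapping property of a cokernel (C, p) of u.


The brief form of the definition of cokernel is that $(\operatorname{coker} u) \circ u = 0$ and

$$
p'u = 0 \quad\text{implies}\quad p' = \psi \circ (\operatorname{coker} u) \ \text{ uniquely}.
$$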
The cokernel of u is determined only up to an isomorphism applied to C; that is, p
is determined only up to left multiplication by an isomorphism. The condition for
(C, p) to be a cokernel is equivalent to the exactness, for every object C', of the
sequence of abelian groups

$$
0 \longrightarrow \operatorname{Hom}(C, C') \xrightarrow{\ \operatorname{Hom}(p,1)\ } \operatorname{Hom}(B, C') \xrightarrow{\ \operatorname{Hom}(u,1)\ } \operatorname{Hom}(A, C').
$$

In fact, pu = 0 makes the sequence a complex, the existence of $\psi$ produces
exactness at Hom(B, C'), and the uniqueness of $\psi$ produces exactness at Hom(C, C').
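
To see this exact sequence at work, consider (as an illustrative assumption, not an example from the text) the category of abelian groups and the morphism u : Z → Z given by multiplication by n ≥ 1, whose cokernel is (Z/n, p) with p the quotient map. Under the identification Hom(Z, C') ≅ C' that sends a homomorphism to its value at 1, the sequence becomes the following.

```latex
\[
0 \longrightarrow \operatorname{Hom}(\mathbb{Z}/n, C')
  \xrightarrow{\ \operatorname{Hom}(p,1)\ } C'
  \xrightarrow{\ n\ } C',
\qquad
\operatorname{Hom}(\mathbb{Z}/n, C') \cong \{\, c \in C' : nc = 0 \,\}.
\]
```

Exactness here is the familiar statement that a homomorphism Z → C' factors through Z/n exactly when its value at 1 is killed by n, and that the factorization is then unique.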


Proposition 4.33. Let C be an additive category. If an element u of Hom(A, B)
has a kernel (K, i) and if m ∈ Hom(B, B') is a monomorphism, then (K, i) is
also a kernel of mu. If u has a cokernel (C, p) and if e ∈ Hom(A', A) is an
epimorphism, then (C, p) is also a cokernel of ue. Briefly

$$
\ker(mu) = \ker u \quad\text{and}\quad \operatorname{coker}(ue) = \operatorname{coker} u.
$$


REMARK. We can safely omit the proof of any dual statement about additive
categories, since the dual follows by expressing the original argument as a
diagram, reversing all the arrows, and writing down the argument that the new
diagram represents.




PROOF. We test whether i = ker u is a kernel of mu. We know that (mu)i =
m(ui) = 0. Suppose that mui' = 0 with i' ∈ Hom(K', A). Since m is a
monomorphism, ui' = 0. Because i is a kernel of u, we obtain $i' = i\varphi$ for a
unique $\varphi \in \operatorname{Hom}(K', K)$. Hence i is a kernel of mu. The statement about
cokernels is dual.


Proposition 4.34. Let C be an additive category. If an element u of Hom(A, B)
has a kernel (K, i), then i is a monomorphism. Dually if u has a cokernel (C, p),
then p is an epimorphism.


PROOF. Suppose that u has a kernel (K, i). For any object K', the zero
morphism i' = 0 of Hom(K', A) has the property that ui' = 0. The uniqueness
property of the kernel says that the $\varphi$ in Hom(K', K) with $i\varphi = i'$ is unique.
Evidently $\varphi = 0$ is one such choice and hence is the only such choice. Thus if f
in Hom(K', K) has if = 0, then f = 0. Therefore i is a monomorphism.


Propositions 4.33 and 4.34 give a first hint that the notation (K, i) for the kernel,
which we know is redundant, may also be inconvenient; it would be far simpler
to refer to the kernel as i, and analogously for cokernels. Then Proposition 4.33
could truly be stated as the displayed formulas in its statement, and Proposition
4.34 would have the tidier statement that every kernel is a monomorphism and
every cokernel is an epimorphism. Let us therefore now allow ourselves to regard
kernels and cokernels as morphisms, rather than pairs consisting of an object and
a morphism. With this convention in place, we always have u ∘ (ker u) = 0 and
(coker u) ∘ u = 0.


Proposition 4.35. Let C be an additive category, and let u be in Hom(A, B). If
u has a kernel and ker u has a cokernel, then ker u is a kernel of coker(ker u). Briefly

$$
\ker(\operatorname{coker}(\ker u)) = \ker u.
$$

Dually if u has a cokernel and coker u has a kernel, then

$$
\operatorname{coker}(\ker(\operatorname{coker} u)) = \operatorname{coker} u.
$$


PROOF. Let (K, i) be a kernel of u, and let (C, p) be a cokernel of i. We are to
show that i is a kernel of p. For the existence step, suppose that i' in Hom(K', A)
has pi' = 0. We are to show that i' factors as $i' = i\varphi$ for some unique $\varphi$ in
Hom(K', K). We know that ui = 0. Since p = coker i, u factors as $u = \psi p$ for
some $\psi$ in Hom(C, B). Then $ui' = (\psi p)i' = \psi(pi') = 0$. Since i = ker u, i'
factors as $i' = i\varphi$ as required. This proves existence of $\varphi$.

For the uniqueness step, suppose that pi' = 0 for some i' in some Hom(K', A).
If i' were to have two distinct factorizations, say as $i' = i\varphi = i\varphi'$, then i could
not be a monomorphism, in contradiction to Proposition 4.34 and the fact that
i = ker u. This proves uniqueness of $\varphi$.




An abelian category C is an additive category with the following two proper- 
ties: 


(iv) every morphism has a kernel and a cokernel, 
(v) every monomorphism is a kernel, and every epimorphism is a cokernel. 


It is evident that the opposite category of any abelian category is abelian. Thus 
we can continue to use duality arguments. 

Property (iv) is certainly desirable if one wants to have a theory involving ho- 
mology and cohomology. Property (v) may be viewed as a converse to Proposition 
4.34; some other authors use a different but equivalent formulation of this axiom. 
The objective is to have a generalization of the kind of factorization that one has 
with homomorphisms of abelian groups: any homomorphism factors canonically 
as the product of the canonical passage to the quotient by the kernel, followed by 
an isomorphism of this quotient onto the image of the homomorphism, followed 
by the inclusion of the image into the range. 


Proposition 4.36. In any abelian category, every morphism that is both a 
monomorphism and an epimorphism is an isomorphism. 


PROOF. If f ∈ Hom(K, A) is a monomorphism, then f = ker g for some g
in some Hom(A, B) by (v). This fact implies that gf = g ∘ (ker g) = 0. If f
is also an epimorphism, then the equality gf = 0 implies that g = 0. Hence
$f = \ker 0_{AB}$. Taking K' = A and $i' = 1_A$ in Figure 4.6, we have 0i' = 0 and
thus have $1_A = f\varphi$ for some $\varphi$ in Hom(A, K). Thus the monomorphism f has
a right inverse and must be an isomorphism.
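
The abelian hypothesis matters here. A standard example from outside the present setting (the category of rings with identity is not additive, and this example is not taken from the text) is the inclusion j : Z → Q, which is a monomorphism and an epimorphism but not an isomorphism. The epimorphism property follows from the computation below, in which f, g : Q → S are ring homomorphisms with fj = gj and a, b are integers with b ≠ 0; the names j, S, a, b are introduced only for this illustration.

```latex
% f(b) is invertible in S because f(b) f(1/b) = f(1) = 1, and likewise for g.
\[
f\!\left(\tfrac{a}{b}\right) = f(a)\, f(b)^{-1} = g(a)\, g(b)^{-1} = g\!\left(\tfrac{a}{b}\right).
\]
```

Thus f = g, so j cancels on the right even though it is not onto as a function.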


Lemma 4.37. In an abelian category C, every monomorphism is the kernel of 
its cokernel, and every epimorphism is the cokernel of its kernel. 


PROOF. If m is a monomorphism, then (v) says that m = ker u for some u.
Substituting into the first conclusion of Proposition 4.35, we obtain ker(coker m) =
m. If e is an epimorphism, then (v) says that e = coker u for some u. Substituting
into the second conclusion of Proposition 4.35, we obtain coker(ker e) = e.


Proposition 4.38. In an abelian category C, any morphism f factors as f = me 
for a monomorphism m and an epimorphism e. Here one such factorization is 
given by 

m = ker(coker f) and e = coker(ker f). 


Any other such factorization f = m’e’ has the property that there is some 
isomorphism x with e’ = xe and m’x =m. 




PROOF. Put m = ker(coker f). Since (coker f)f = 0, the brief form of the
definition of kernel gives f = me for some e. We are going to prove that e is an
epimorphism. Thus suppose that re = 0 for some morphism r. The brief form
of the definition of kernel shows that e = (ker r)e' for some morphism e'. Then
we have

f = me = m(ker r)e' = m'e', where m' = m ker r.

Being a kernel, ker r is a monomorphism. As the composition of two
monomorphisms, m' is a monomorphism. Lemma 4.37 shows that m' = ker p', where
p' = coker m'.

Put p = coker m. The definition of m and the second identity of Proposition
4.35 give p = coker(ker(coker f)) = coker f. Since m' = ker p', we have
p'm' = 0. Hence p'f = p'm'e' = 0. Since p = coker f, the brief form of the
definition of cokernel shows that p' = sp for some s. Thus p'm = spm = 0, the
latter equality holding because p = coker m. Since m' = ker p', the brief form
of the definition of kernel gives m = m't for some t.

Resubstituting for m' gives m = m't = m(ker r)t. Since m is a
monomorphism, we can cancel and obtain $1_X$ = (ker r)t, where X is the codomain of ker r.
In other words, ker r has a right inverse. Being a monomorphism, it must be an
isomorphism. Since any morphism v has v ker v = 0, we obtain r ker r = 0 and
conclude that r = 0. Therefore e is an epimorphism, as asserted.

Since e is an epimorphism, Lemma 4.37 gives e = coker(ker e), and Propo- 
sition 4.33 gives kere = ker(me) = ker f. Therefore e = coker(ker f). This 
completes the proof of existence of the decomposition. 

For uniqueness, suppose that f = m'e' for a monomorphism m' and an
epimorphism e'. Proposition 4.33 gives ker f = ker(m'e') = ker e', as well
as ker f = ker(me) = ker e, the understanding being that these equalities hold
up to an isomorphism on the right. Set u = ker e and u' = ker e'; then u = u'w
for some isomorphism w. Since e and e' are epimorphisms, Lemma 4.37 gives
e = coker u and e' = coker u'. Since m' is a monomorphism, the equality
0 = f(ker f) = fu = m'e'u implies that e'u = 0; by the brief form of the
definition of coker u as a cokernel, e' factors as e' = xe for a unique x. Similarly
the equality 0 = f ker f = fu' = meu' implies that eu' = 0; by the brief form of
the definition of coker u' as a cokernel, e factors as e = x'e' for a unique x'. Then
e = x'e' = x'xe; since e is an epimorphism, x'x is the identity on its domain.
Similarly e' = xe = xx'e', and it follows that xx' is the identity on its domain.
Consequently x is an isomorphism. Multiplying e' = xe by m' on the left gives
me = f = m'e' = m'xe; since e is an epimorphism, m = m'x. This completes
the proof.


With this canonical factorization in hand, we introduce two terms that will
simplify the definition of “exact sequence.” We define the image and coimage
of f = me in Hom(A, B) by

m = image f and e = coimage f.

In words, the image of any morphism is its monomorphism factor, and the coimage
is its epimorphism factor; in particular, a monomorphism is its own image, and
an epimorphism is its own coimage.¹⁷ Let us see what the factorization and these
formulas say in terms of diagrams. We write (K, i) for the kernel of f and (C, p)
for the cokernel of f. Let I be the codomain of e, which equals the domain of
m. In terms of a diagram, the situation for f is then given by

$$
K \xrightarrow[\;=\,\ker f\;]{\;i\,=\,\ker e\;}
A \xrightarrow[\;=\,\operatorname{coimage} f\;]{\;e\,=\,\operatorname{coker} i\;}
I \xrightarrow[\;=\,\operatorname{image} f\;]{\;m\,=\,\ker p\;}
B \xrightarrow[\;=\,\operatorname{coker} f\;]{\;p\,=\,\operatorname{coker} m\;}
C.
$$

The top row of labels explains the relationships among i, e, m, p, and the bottom
row of labels relates i, e, m, p to f. The morphism f itself is the composition of
the two morphisms in the center.

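In a good category of modules, we can interpret this diagram in terms of the
two short exact sequences

$$
0 \longrightarrow K \xrightarrow{\ i\ } A \xrightarrow{\ e\ } A/\operatorname{image} i \longrightarrow 0,
$$

$$
0 \longrightarrow A/\operatorname{image} i \xrightarrow{\ m\ } B \xrightarrow{\ p\ } C \longrightarrow 0,
$$

which we can merge into a single 6-term exact sequence

$$
0 \longrightarrow K \xrightarrow{\ i\ } A \xrightarrow{\ f\ } B \xrightarrow{\ p\ } C \longrightarrow 0.
$$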
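
As an illustration (an example chosen here, not one from the text), let f : Z → Z/6 be the homomorphism of abelian groups with f(1) = 2. Its image is {0, 2, 4}, which is isomorphic to Z/3, and the factorization f = me and the merged exact sequence read as follows.

```latex
\[
f = me, \qquad
e : \mathbb{Z} \to \mathbb{Z}/3,\ \ e(x) = x \bmod 3, \qquad
m : \mathbb{Z}/3 \to \mathbb{Z}/6,\ \ m(y) = 2y,
\]
\[
0 \longrightarrow 3\mathbb{Z} \xrightarrow{\ i\ } \mathbb{Z} \xrightarrow{\ f\ } \mathbb{Z}/6
  \xrightarrow{\ p\ } \mathbb{Z}/2 \longrightarrow 0 .
\]
```

Here i is the inclusion of ker f = 3Z, and p is reduction modulo 2, whose kernel {0, 2, 4} is the image of m; thus m = ker p and e = coker i, as in the labeled diagram.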


Now we can define complexes and exact sequences for abelian categories, and 
we can readily check that the new definitions are consistent with the definitions 
for good categories of modules. A chain complex is a doubly infinite sequence of 
morphisms with decreasing indexing such that the consecutive compositions are 
defined and are 0. If f ∈ Hom(A, B) and g ∈ Hom(B, C) are given morphisms,
then the sequence

$$
A \xrightarrow{\ f\ } B \xrightarrow{\ g\ } C
$$

is exact at B if image f = ker g, or equivalently if coker f = coimage g. As
usual in the subject of abelian categories, the equality sign here means “can be
taken as.” In more detail if f and g decompose as f = me and g = m'e', image f
is defined to be m, and ker g equals ker e'. Thus the condition for exactness is


¹⁷The term “coimage” is not really needed for recognizing exact sequences, but it makes any
implementation of duality more symmetric.




that m be a kernel of e'. Since u(ker u) = 0 for any morphism u, exactness at
B implies that e'm = 0. Then gf = m'e'me = 0, and we see that the given
sequence (when extended by 0's at each end) is a complex.

Exactness of any finite or infinite sequence of morphisms whose consecutive 
compositions are defined means exactness at every object X in the sequence 
for which there is an incoming morphism in some Hom(W, X) and there is an 
outgoing morphism in some Hom(X, Y). With the kind of indexing used for a 
chain complex, a sequence 


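$$\cdots \longrightarrow X_{n+1} \xrightarrow{\;m_n e_n\;} X_n \xrightarrow{\;m_{n-1} e_{n-1}\;} X_{n-1} \longrightarrow \cdots$$

is exact if $m_n = \ker e_{n-1}$, or equivalently if $e_{n-1} = \operatorname{coker} m_n$, for all $n$.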
For a sequence of four morphisms of the form 


$$0 \longrightarrow K \xrightarrow{\;m\;} A \xrightarrow{\;e\;} C \longrightarrow 0,$$

exactness means exactness at $K$, $A$, and $C$. The conditions are that $m$ is a
monomorphism, $e$ is an epimorphism, and $m = \ker e$ (or equivalently that $e =
\operatorname{coker} m$). In this case the sequence is called a short exact sequence.

One can now proceed to define projectives and injectives for any abelian 
category as certain objects in the same way as in Figures 4.3 and 4.4, and extend 
all the results of earlier sections of this chapter to all abelian categories. We shall 
not carry out this detail.18

Instead, we shall indicate an approach to carrying out this detail that takes most 
of the difficulty out of translating results from the context of good categories to 
the context of abelian categories. It is to use the notion of “members.” The 
word “members” in the present setting refers to something that substitutes for 
elements in situations in which objects need not necessarily be sets of elements. 
The idea is to recast elements, when they exist, in terms of morphisms and then 
to generalize the resulting definition. For orientation, consider the category $\mathcal{C}_R$
of all unital left $R$ modules, $R$ being a ring with identity. Let us write $R_0$ for
the left $R$ module $R$. The elements of a unital left $R$ module $X$ are then in
one-one correspondence with the $R$ homomorphisms of $R_0$ into $X$, the element
$x$ corresponding to the homomorphism that carries $r$ to $rx$. Thus the category
$\mathcal{C}_R$ has a distinguished object $R_0$ such that the elements of any object $X$ are in
one-one correspondence with $\operatorname{Hom}(R_0, X)$. Hence any argument about elements
for this category immediately translates into an argument about morphisms.

The trouble is that a general abelian category has no distinguished object to play
the role of $R_0$. The idea for getting around this difficulty is to take all possible


18 The entire theory for abelian categories is carried out in detail in Freyd’s book Abelian
Categories: An Introduction to the Theory of Functors.




objects $X_0$ in place of $R_0$, consider the union on $X_0$ of all sets $\operatorname{Hom}(X_0, X)$,
introduce an equivalence relation, and hope for the best.

The definition is as follows. Let $\mathcal{C}$ be an abelian category, fix $X$ in $\operatorname{Obj}(\mathcal{C})$,
and consider all morphisms with codomain $X$. Two such morphisms $x$ and $y$ are
said to be equivalent morphisms for current purposes, written $x \equiv y$, if there
exist epimorphisms $u$ and $v$ such that $xu = yv$. It is evident that “equivalent”
is reflexive and symmetric. Transitivity requires proof, and we return to this
matter in a moment. Once $\equiv$ has been shown to be an equivalence relation, an
equivalence class of such morphisms is called a member of $X$. We write $x \in_m X$
to indicate that $x$ is a morphism with codomain $X$, hence to indicate that $x$ is a
morphism whose equivalence class is a member of $X$. To avoid clumsy wording
when there is really no possibility of confusion, we often simply say that $x$ is
a member of $X$. The question arises whether this definition presents any
set-theoretic difficulties. As usual in category theory, one can answer the question
painlessly by working when necessary only with subcategories for which the
objects actually form a set; in this case, the union over all objects $X$ and $Y$ in the
subcategory of all the groups $\operatorname{Hom}(X, Y)$ of morphisms is a set, and there is no
problem. Let us return to a special case of our example.


EXAMPLE OF MEMBERS. Let $\mathcal{C} = \mathcal{C}_{\mathbb{Z}}$ be the category of all abelian groups, and
fix an abelian group $X$. If $x$ is an abelian-group homomorphism with codomain
$X$, let us use Proposition 4.38 to write $x = me$ for a monomorphism $m$ and
an epimorphism $e$. Then $x \equiv m$, and thus we might just as well consider only
one-one homomorphisms into $X$. If $H$ is the image of $x$, then we can view
$x$ as a composition $x = i_H y$ of a homomorphism $y$ carrying the domain of $x$
onto $H$, followed by the inclusion $i_H : H \to X$. The homomorphism $y$ is an
isomorphism, hence is an epimorphism. Thus $x \equiv i_H$. It is apparent that no
two inclusions of subgroups of $X$ into $X$ are equivalent morphisms. Since every
inclusion of a subgroup of $X$ into $X$ yields a member of $X$, the members of
$X$ are exactly the subgroups of $X$. Thus for example the set of members of $\mathbb{Z}$
corresponds to the set of integers $\geq 0$, in which addition is lost, and does not
correspond exactly to the set of elements of $\mathbb{Z}$. This fact is a little discouraging,
but it turns out not to be as bad an omen as one might expect.
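
The identification of members with subgroups can be checked by hand in a small finite case. The following is only a minimal sketch, assuming the conclusion of the example (two homomorphisms into an abelian group are equivalent exactly when they have the same image) and taking X to be the cyclic group Z/nZ; the names `image_of_hom` and `members` are hypothetical helpers introduced for this illustration.

```python
def image_of_hom(a: int, n: int) -> frozenset:
    """Image of the homomorphism Z -> Z/nZ that sends 1 to a mod n.

    By the example, this image (a subgroup of Z/nZ) determines the member.
    """
    return frozenset((a * t) % n for t in range(n))


def members(n: int) -> set:
    """Distinct images of homomorphisms Z -> Z/nZ, one per member of Z/nZ.

    Every subgroup of Z/nZ is cyclic, so homomorphisms out of Z reach all of them.
    """
    return {image_of_hom(a, n) for a in range(n)}


if __name__ == "__main__":
    # Z/12Z has one member per divisor of 12, far fewer than its 12 elements.
    print(len(members(12)))
    # A homomorphism and its negative have the same image, hence the same member.
    print(image_of_hom(3, 12) == image_of_hom(-3, 12))
```

In this sketch the members of Z/12Z number six rather than twelve, and a homomorphism and its negative give the same member, which is the loss of additive structure that the example describes.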


Returning to the setting of a general abelian category, we work toward a proof 
that $\equiv$ is an equivalence relation. We need the notion of the “pullback” of two
morphisms, which we define by a universal mapping property momentarily. The 
appropriate construction establishing existence appears in the next proposition. 
Then we prove a proposition for using pullback as a tool, and afterward we prove 
the transitivity. 

In an abelian category $\mathcal{C}$, let $X, Y, Z$ be objects, and let $f \in \operatorname{Hom}(X, Z)$
and $g \in \operatorname{Hom}(Y, Z)$ be morphisms. A pullback of the pair $(f, g)$ is a triple




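$(W, \tilde f, \tilde g)$ in which $W$ is an object in $\mathcal{C}$, in which $\tilde f$ and $\tilde g$ are morphisms with
$\tilde f \in \operatorname{Hom}(W, Y)$ and $\tilde g \in \operatorname{Hom}(W, X)$, and in which the following universal
mapping property holds: whenever $(W', f', g')$ is a triple such that $W'$ is an object
in $\mathcal{C}$ and $f'$ and $g'$ are morphisms with $f' \in \operatorname{Hom}(W', Y)$ and $g' \in \operatorname{Hom}(W', X)$
and with $fg' = gf'$, then there exists a unique $\varphi \in \operatorname{Hom}(W', W)$ such that
$f' = \tilde f\varphi$ and $g' = \tilde g\varphi$. See Figure 4.8.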


FIGURE 4.8. The pullback of a pair (f, g) of morphisms. 


Proposition 4.39. In an abelian category $\mathcal{C}$, let $X, Y, Z$ be objects, and let
$f \in \operatorname{Hom}(X, Z)$ and $g \in \operatorname{Hom}(Y, Z)$ be morphisms. Let $X \oplus Y$ be the direct
sum, let $p_X$ and $p_Y$ be the projections on the two factors, define $h = fp_X - gp_Y$
in $\operatorname{Hom}(X \oplus Y, Z)$, and let $m = \ker h$. Then a pullback $(W, \tilde f, \tilde g)$ of $(f, g)$ is
given by $W = \operatorname{domain} m$, $\tilde f = p_Y m$, and $\tilde g = p_X m$.


REMARKS. The dual statement asserts the existence of a pushout of a pair 
of morphisms, and it is a consequence of Proposition 4.39. Problem 35 at the 
end of the chapter points out that the proof of Proposition 4.19a made use of a 
concretely constructed pullback, while the proof of Proposition 4.19b made use 
of a concretely constructed pushout. 


PROOF. From $hm = h\ker h = 0$, we obtain $0 = fp_Xm - gp_Ym = f\tilde g - g\tilde f$,
and thus $f\tilde g = g\tilde f$. Now suppose that $W'$, $f'$, and $g'$ are given with $fg' =
gf'$. Then $m' = (g', f')$ is a morphism in $\operatorname{Hom}(W', X \oplus Y)$ such that $hm' =
fp_Xm' - gp_Ym' = fg' - gf' = 0$. Therefore $m'$ factors through $m = \ker h$ as
$(g', f') = m\varphi$ for a unique $\varphi \in \operatorname{Hom}(W', W)$. Application of $p_X$ and $p_Y$ to this
equality gives $g' = p_Xm\varphi = \tilde g\varphi$ and $f' = p_Ym\varphi = \tilde f\varphi$.


Proposition 4.40. In the notation of Figure 4.8 and Proposition 4.39 if $f$ is a
monomorphism, then so is $\tilde f$. If $f$ is an epimorphism, then so is $\tilde f$; in the case
of an epimorphism, $\ker f$ factors as $\ker f = \tilde g(\ker \tilde f)$.


PROOF. Throughout the proof let $i_X$ and $i_Y$ be the injections associated with the
direct sum $X \oplus Y$. Suppose that $f$ is a monomorphism, and suppose that $\tilde f w = 0$
for some morphism $w$ with codomain $W$. Since $\tilde f = p_Ym$, $p_Ymw = 0$. Then
$0 = (fp_X - gp_Y)mw = fp_Xmw - 0 = fp_Xmw$. Since $f$ is a monomorphism,
$p_Xmw = 0$. Since also $\tilde f w = p_Ymw = 0$, $mw = (i_Xp_X + i_Yp_Y)mw = 0$. But $m$
is a monomorphism, and therefore $w = 0$. Consequently $\tilde f$ is a monomorphism.

For the remainder of the proof, assume that f is an epimorphism. Let us 




see that $h = fp_X - gp_Y$ is an epimorphism. In fact, if $zh = 0$, then $0 =
z(fp_X - gp_Y)i_X = zfp_Xi_X = zf$. Since $f$ is an epimorphism, $z = 0$. Thus $h$ is
an epimorphism.

It follows from Lemma 4.37 that $h = \operatorname{coker}(\ker h) = \operatorname{coker} m$. To prove that
$\tilde f$ is an epimorphism, suppose that $v\tilde f = 0$ for some morphism $v$ with domain
$Y$. This means that $vp_Ym = 0$. Since $h$ is the cokernel of $m$, $vp_Y$ factors as
$vp_Y = v'h$ for some morphism $v'$. Applying $i_X$ on the right end of both sides
gives $0 = vp_Yi_X = v'hi_X = v'(fp_X - gp_Y)i_X = v'fp_Xi_X = v'f$. Since $f$ is
an epimorphism, $v' = 0$. Hence $vp_Y = v'h = 0$. Since $p_Y$ is an epimorphism,
$v = 0$. Therefore $\tilde f$ is an epimorphism.

Now set $k = \ker f$, and let $K$ be its domain. The morphisms $k \in \operatorname{Hom}(K, X)$
and $0 \in \operatorname{Hom}(K, Y)$ have $fk = 0 = g0$. If we set $W' = K$, $f' = 0$, and $g' = k$,
then $fg' = gf'$, and Proposition 4.39 produces a unique $\varphi$ in $\operatorname{Hom}(K, W)$ with
$0 = \tilde f\varphi$ and $k = \tilde g\varphi$. We shall show that $\varphi$ is a kernel of $\tilde f$, and then the equation
$k = \tilde g\varphi$ completes the proof.

We know that $\tilde f\varphi = 0$. Thus suppose that $\tilde f v = 0$ for some morphism $v$ in
some $\operatorname{Hom}(K', W)$. Since $f\tilde g = g\tilde f$, we have $f\tilde g v = g\tilde f v = 0$. Thus $\tilde g v$ factors
through $k = \ker f$ as $\tilde g v = kv'$ for some $v'$ in $\operatorname{Hom}(K', K)$.

Put $\Phi = v - \varphi v'$. Then $\tilde f\Phi = \tilde f v - \tilde f\varphi v' = 0 - 0 = 0$, and $\tilde g\Phi =
\tilde g v - \tilde g\varphi v' = kv' - kv' = 0$. Consequently if we put $W'' = K'$, $f'' = 0$, and
$g'' = 0$, then $\Phi$ and $0$ are two morphisms in $\operatorname{Hom}(K', W)$ with $f'' = \tilde f\Phi = \tilde f 0$
and $g'' = \tilde g\Phi = \tilde g 0$. By uniqueness of the morphism in the universal mapping
property for pullbacks, $\Phi = 0$. Therefore $v = \varphi v'$, and $v$ has been exhibited as
factoring through $\varphi$.

If $v$ factors through $\varphi$ also as $v = \varphi v''$, then $0 = \varphi(v' - v'')$, and we have
$k(v' - v'') = \tilde g\varphi(v' - v'') = 0$. Since $k = \ker f$ is a monomorphism, $v' = v''$.
Thus the factorization of $v$ through $\varphi$ is unique, and $\varphi$ is a kernel of $\tilde f$. This
completes the proof.
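
For finite abelian groups the construction in Proposition 4.39 can be carried out by brute force, and the conclusions of Proposition 4.40 can then be checked directly. The following is only a sketch under stated assumptions: the groups Z/4Z and Z/6Z, the reduction maps to Z/2Z, and the helper name `pullback` are illustrative choices that do not come from the text.

```python
from itertools import product


def pullback(f, g, X, Y):
    """Pairs (x, y) in X (+) Y with f(x) == g(y).

    This is the kernel of h(x, y) = f(x) - g(y), as in Proposition 4.39.
    """
    return [(x, y) for x, y in product(X, Y) if f(x) == g(y)]


X = range(4)                 # elements of Z/4Z
Y = range(6)                 # elements of Z/6Z
f = lambda x: x % 2          # reduction Z/4Z -> Z/2Z, an epimorphism
g = lambda y: y % 2          # reduction Z/6Z -> Z/2Z

W = pullback(f, g, X, Y)
f_tilde = lambda w: w[1]     # p_Y restricted to W, a map W -> Y
g_tilde = lambda w: w[0]     # p_X restricted to W, a map W -> X

if __name__ == "__main__":
    # The square commutes: f g~ = g f~ on every element of W.
    print(all(f(g_tilde(w)) == g(f_tilde(w)) for w in W))
    # f is onto, and so is f~ (Proposition 4.40).
    print({f_tilde(w) for w in W} == set(Y))
    # g~ carries ker f~ onto ker f = {0, 2} (Proposition 4.40).
    print(sorted(g_tilde(w) for w in W if f_tilde(w) == 0))
```

Here W consists of the pairs (x, y) with x and y of the same parity, so W has 12 elements.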


Proposition 4.41. Let $\mathcal{C}$ be an abelian category, let $X$ be an object in $\mathcal{C}$,
and define $x \equiv y$ for two morphisms $x$ and $y$ with codomain $X$ if there exist
epimorphisms $u$ and $v$ with $xu = yv$. Then the relation $\equiv$ on the morphisms
with codomain $X$ is transitive and hence is an equivalence relation.


REMARK. A nontrivial special case is that the obvious equivalences $xu \equiv x$
and $x \equiv xv$ imply the nonobvious equivalence $xu \equiv xv$ when $u$ and $v$ are
epimorphisms.


PROOF. Assuming that $x \equiv y$ and $y \equiv z$, write $xu = yv$ and $yr = zs$
for epimorphisms $u, v, r, s$. Since $v$ and $r$ have the same codomain, namely
$\operatorname{domain}(y)$, the pullback $(\tilde v, \tilde r)$ of $(v, r)$ as in Proposition 4.39 is well defined,
and Proposition 4.40 shows that $\tilde v$ and $\tilde r$ are epimorphisms. Since $r\tilde v = v\tilde r$, we
obtain $xu\tilde r = yv\tilde r = yr\tilde v = zs\tilde v$. The morphisms $u\tilde r$ and $s\tilde v$ are epimorphisms
as compositions of epimorphisms, and therefore $x \equiv z$.


Fix an object $X$. Then $0_{0X}$ is a member of $X$ called the zero member, denoted
by 0. Every zero morphism $0_{YX}$ with codomain $X$ is equivalent to $0_{0X}$; in fact,
$0_{YX} = 0_{0X}0_{Y0}$. The morphism $0_{Y0}$ is an epimorphism because if $f \in \operatorname{Hom}(0, Z)$
has $f0_{Y0} = 0_{YZ}$, then $f$ is the unique element $0_{0Z}$ of $\operatorname{Hom}(0, Z)$. Conversely
any nonzero morphism $r$ in $\operatorname{Hom}(Y, X)$ is inequivalent to $0_{YX}$. In fact, an equality
$ru = 0_{YX}v$ for epimorphisms $u$ and $v$ would imply that $r = 0_{YX}$, since we can
cancel in the equality $ru = 0_{YX}v = 0_{YX}u$.

Each $x \in_m X$ has a “negative,” namely the class of the negative of the
representative $x$ of the member; i.e., taking the negative of a morphism is respected
in passing to classes. We write $-x \in_m X$ for the negative. (Warning: As
the example with the category of abelian groups shows, one should use care in
inferring any relationship between “negatives” and zero members.)

If $f$ is a morphism in $\operatorname{Hom}(X, Y)$, then each member $x \in_m X$ yields by
composition a well-defined member $fx \in_m Y$. To see that this notion is indeed
well defined, suppose that $x \equiv x'$, and choose epimorphisms $u$ and $v$ with
$xu = x'v$. Then $(fx)u = f(xu) = f(x'v) = (fx')v$ shows that $fx \equiv fx'$.

The main result is Theorem 4.42 below, which gives a calculus for diagram 
chases using members in general abelian categories. After the proof we shall be 
content with one example of how the theorem allows all the diagram chases in 
earlier sections of this chapter to be extended to general abelian categories. The 
example is the proof of the part of the Snake Lemma that involves an explicit 
construction.19 More examples appear in Problems 34–35 at the end of the
chapter. 


Theorem 4.42. The members of an abelian category satisfy the following
properties:
(a) a morphism $f \in \operatorname{Hom}(X, Y)$ is a monomorphism if and only if every
$x \in_m X$ with $fx \equiv 0$ has $x \equiv 0$,
(b) a morphism $f \in \operatorname{Hom}(X, Y)$ is a monomorphism if and only if every pair
of members $x \in_m X$ and $x' \in_m X$ with $fx \equiv fx'$ has $x \equiv x'$,
(c) a morphism $g \in \operatorname{Hom}(X, Y)$ is an epimorphism if and only if for each
$y \in_m Y$, there exists some $x \in_m X$ with $gx \equiv y$,
(d) a morphism $h \in \operatorname{Hom}(X, Y)$ is the 0 morphism if and only if every
$x \in_m X$ has $hx \equiv 0$,
(e) a sequence $X \xrightarrow{\;f\;} Y \xrightarrow{\;g\;} Z$ is exact at $Y$ if and only if $gf = 0$ and also
each $y \in_m Y$ with $gy \equiv 0$ has some $x \in_m X$ with $fx \equiv y$,


19 For more detail about this example and for further examples, see Mac Lane’s Categories for
the Working Mathematician.




(f) whenever $x, y, z$ are members of an object $X$ and $x \equiv yu + zv$ for some
epimorphisms $u$ and $v$, then $xu' - yv' \equiv z$ for some epimorphisms $u'$
and $v'$.


REMARKS. 

(1) The interpretations of (a) through (e) are straightforward enough and
already give an indication that the notion of a member may be of some help
in translating proofs for good categories into proofs for abelian categories.
Application of (d) to the difference $f_1 - f_2$ of two morphisms in $\operatorname{Hom}(X, Y)$ shows
that $f_1x \equiv f_2x$ for all $x \in_m X$ implies $f_1 = f_2$.

(2) The interpretation of (f) is more subtle. As the example with the Snake 
Lemma below will show, conclusion (f) makes it possible to mirror in the theory 
of members the kind of subtraction that takes place with elements of a module to 
get their difference to be in the kernel of some homomorphism. 


PROOF. For (a) and (b), if $f$ is a monomorphism and $fx \equiv fx'$, then $fxu =
fx'v$ for suitable epimorphisms $u$ and $v$, and cancellation yields $xu = x'v$ and
hence $x \equiv x'$. Conversely suppose $fx \equiv 0$ only for $x \equiv 0$. If $f$ has $fx' = 0_{AY}$
for some $x'$ in some $\operatorname{Hom}(A, X)$, then $fx' \equiv 0$ and so $x' \equiv 0$ by hypothesis. In
this case, $x' = 0_{AX}$ because we know that nonzero morphisms are not equivalent
to 0.

For (c), suppose that $g$ is an epimorphism. If $y \in_m Y$ is given, let $y$ be
in $\operatorname{Hom}(X', Y)$, and let $(\tilde g, \tilde y)$ be the pullback of $(g, y)$, satisfying $y\tilde g = g\tilde y$.
Proposition 4.40 shows that $\tilde g$ is an epimorphism, and then $y \equiv gx$ for $x = \tilde y$.
Conversely if $g$ fails to be an epimorphism, then there exists $r \neq 0$ in some
$\operatorname{Hom}(Y, Z)$ with $rg = 0_{XZ}$. If there is some $x$ in some $\operatorname{Hom}(A, X)$ with $gx \equiv 1_Y$,
we can compose with $r$ on the left of both sides and obtain $rgx \equiv r1_Y = r$. Since
the left side equals $0_{AZ}$, which is equivalent to $0_{YZ}$, we obtain $0_{YZ} \equiv 0_{AZ} \equiv r$,
which we know not to be true for nonzero members $r$ of $\operatorname{Hom}(Y, Z)$.

For (d), if $h = 0_{XY}$ and if $x$ is in $\operatorname{Hom}(Z, X)$, then $hx = 0_{XY}x = 0_{ZY} \equiv 0_{0Y}$.
Conversely if every $x$ in every $\operatorname{Hom}(Z, X)$ has $hx \equiv 0_{0Y}$, we take $Z = X$ and
$x = 1_X$. Then $hu = hxu = 0_{0Y}v$ for some epimorphisms $u \in \operatorname{Hom}(A, X)$ and
$v \in \operatorname{Hom}(A, 0)$. This says that $hu = 0_{AY} = 0_{XY}u$. Since $u$ is an epimorphism,
$h = 0_{XY}$.

For (e), let $f = me$ be the decomposition of $f$ as in Proposition 4.38. Then
$m = \operatorname{image} f$, and we define $k = \ker g$. If the sequence is exact at $Y$, then
$gf = 0$ as part of the definition. Suppose $y \in_m Y$ has $gy \equiv 0$, i.e., $gy = 0$. Since
$m = \ker g$ by exactness, the equality $gy = 0$ and the definition of kernel together
imply that $y = my'$ for some $y'$. Using Proposition 4.39, let $(e, y')$ have $(\tilde e, \tilde y')$
as pullback, satisfying $e\tilde y' = y'\tilde e$. Since $e$ by construction is an epimorphism,
Proposition 4.40 shows that $\tilde e$ is an epimorphism. From the computation $f\tilde y' =
me\tilde y' = my'\tilde e = y\tilde e$, we obtain $f\tilde y' \equiv y$. Then $x = \tilde y'$ has $x \in_m X$ and $fx \equiv y$.




Conversely suppose that $gf = 0$ and that the other condition holds. Since $e$
is an epimorphism, the equality $gf = 0$ implies that $gm = 0$. The definition of
$k = \ker g$ thus gives $m = k\varphi$ for some morphism $\varphi$. Meanwhile, the morphism
$k = \ker g$ has $k \in_m Y$ and $gk = 0$. Thus $gk \equiv 0$. The hypothesis produces
$x \in_m X$ with $fx \equiv k$, i.e., with $mexu = kv$ for suitable epimorphisms $u$ and
$v$. Write $ex = m'e'$ according to Proposition 4.38. Then $mm'e'u = kv$, and the
uniqueness in Proposition 4.38 shows that $k = mm'\psi$ for some isomorphism $\psi$.
Putting the results together gives $m = k\varphi = mm'\psi\varphi$ and $k = mm'\psi = k\varphi m'\psi$.
Since $m$ and $k$ are monomorphisms, $1 = m'\psi\varphi$ and $1 = \varphi m'\psi$. These show that
$\varphi$ has a left inverse and a right inverse, hence is an isomorphism. Then $m'$ too is
an isomorphism, and $k = m$ except for a factor of an isomorphism on the right
side. This means that we can take $\ker g = \operatorname{image} f$ and that the given sequence
is exact at $Y$.

For (f), let $x \equiv yu + zv$. Then $xu_1 = (yu + zv)v_1$ for suitable epimorphisms
$u_1$ and $v_1$, and $xu_1 - y(uv_1) = zvv_1$. Consequently $xu_1 - y(uv_1) = zvv_1 \equiv z$,
and (f) follows with $u' = u_1$ and $v' = uv_1$.


Theorem 4.42 enables us to use members to verify properties of morphisms in 
diagrams, but it does not by itself construct any morphisms. That is, just because 
we know what the equivalence class of $fx$ should be for every $x \in_m X$ does not
mean that we have a construction of $f$; it means only that we know how to work
with $f$ once $f$ is known to exist. Specifically we know from Remark 1 with
the theorem that there cannot be a different morphism $g$ with $fx \equiv gx$ for all
$x \in_m X$. Some tools that we have for constructing morphisms for a general abelian
category are the existence of kernels and cokernels via Axiom (iv), Proposition 
4.39 asserting the existence of pullbacks of pairs of morphisms, and the dual of 
Proposition 4.39 asserting the existence of pushouts of pairs of morphisms. For 
particular categories of interest, the hypotheses “enough projectives” and “enough 
injectives” provide additional constructions of morphisms. 

The most complicated example of a constructed mapping that we encountered 
in the theory for good categories was the connecting homomorphism in the Snake 
Lemma. In the generalization to abelian categories, the construction of the 
connecting morphism has to go outside the usual diagram given in Figure 4.2. 
Problem 33 at the end of the chapter will compare the actual construction and 
Figure 4.2 for the chain map of exact sequences of abelian groups given below 
and observe that the two diagrams are different: 


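$$
\begin{array}{ccccccccc}
0 & \longrightarrow & \mathbb{Z} & \xrightarrow{\;\times 8\;} & \mathbb{Z} & \xrightarrow{\;1\,\mapsto\,1 \bmod 8\;} & \mathbb{Z}/8\mathbb{Z} & \longrightarrow & 0 \\
 & & \big\downarrow{\scriptstyle \times 4} & & \big\downarrow{\scriptstyle \times 2} & & \big\downarrow{\scriptstyle 1 \bmod 8\,\mapsto\,2 \bmod 4} & & \\
0 & \longrightarrow & \mathbb{Z} & \xrightarrow{\;\times 4\;} & \mathbb{Z} & \xrightarrow{\;1\,\mapsto\,1 \bmod 4\;} & \mathbb{Z}/4\mathbb{Z} & \longrightarrow & 0
\end{array}
$$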




The domain of the connecting homomorphism for this situation is the set of even 
members of Z/8Z, and the mapping carries 2 + 8Z to 1 + 4Z in Z/4Z. 
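
The element-level recipe for the connecting homomorphism can be traced on this example. The following is a sketch only, and it assumes a particular reading of the diagram: the vertical maps are taken to be multiplication by 4 on the left, multiplication by 2 in the middle, and 1 mod 8 ↦ 2 mod 4 on the right, which is what commutativity of the two squares forces once the right-hand map is given. The function name `connecting` and the parameter `t` (which selects a lift) are hypothetical.

```python
def connecting(c: int, t: int = 0) -> int:
    """Chase c in ker(gamma), a subgroup of Z/8Z, to coker(alpha) = Z/4Z.

    Assumed maps: alpha = x4, beta = x2, gamma(c mod 8) = 2c mod 4.
    The integer t picks the lift b = c + 8t of c to the middle copy of Z.
    """
    if (2 * c) % 4 != 0:
        raise ValueError("c must lie in ker(gamma), i.e. c must be even")
    b = (c % 8) + 8 * t          # a lift of c under Z -> Z/8Z
    beta_b = 2 * b               # apply beta = multiplication by 2
    a_prime = beta_b // 4        # the unique a' with 4 * a' == beta(b)
    return a_prime % 4           # class of a' in coker(alpha) = Z/4Z


if __name__ == "__main__":
    # Values on the even classes 0, 2, 4, 6 of Z/8Z.
    print([connecting(c) for c in (0, 2, 4, 6)])
    # A different lift of the same class gives the same answer.
    print(connecting(2, t=-1) == connecting(2))
```

In particular the class 2 + 8Z is carried to 1 + 4Z, matching the value stated above, and the result does not depend on the lift because changing t changes a' by a multiple of 4.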


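EXAMPLE OF DIAGRAM CHASE. In the setting of the Snake Lemma (Lemma
4.6), we shall construct the connecting morphism $\omega$ and verify that its value on
each member of its domain corresponds to what we expect on the basis of Lemma
4.6. The given snake diagram, partially enlarged toward Figure 4.2, is

$$
\begin{array}{ccccccccc}
 & & & & & & C_0 & & \\
 & & & & & & \big\downarrow{\scriptstyle k} & & \\
 & & A & \xrightarrow{\;\varphi\;} & B & \xrightarrow{\;\psi\;} & C & \longrightarrow & 0 \\
 & & \big\downarrow{\scriptstyle \alpha} & & \big\downarrow{\scriptstyle \beta} & & \big\downarrow{\scriptstyle \gamma} & & \\
0 & \longrightarrow & A' & \xrightarrow{\;\varphi'\;} & B' & \xrightarrow{\;\psi'\;} & C' & & \\
 & & \big\downarrow{\scriptstyle p} & & & & & & \\
 & & A'_0 & & & & & &
\end{array}
\qquad (*)
$$

with the rows exact and the squares commuting. The added parts at the top
and bottom are the kernel $(C_0, k)$ of $\gamma$ and the cokernel $(A'_0, p)$ of $\alpha$. Once
the connecting homomorphism has been constructed, the proof of exactness will
involve a diagram chase that makes rather straightforward use of Theorem 4.42,
including conclusion (f). By contrast, the initial construction will involve a
different sort of diagram, namely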


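[Diagram: the adjusted snake diagram, enlarged by the pullback $B_0$ at the top and by the dual object $B'_0$ at the bottom, with dashed arrows for the morphisms that are constructed in the text.]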




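In the construction we adjust the first row of $(*)$ to make it exact when a
0 is included at the left end. To do so, we factor $\varphi$ according to Proposition
4.38 as $\varphi = me$, we let $\bar A = \operatorname{domain} m = \operatorname{codomain} e$, and we write $\bar\varphi$ for
$m$. The commutativity of the left square of $(*)$ implies that $\varphi'\alpha(\ker\varphi) =
\beta\varphi(\ker\varphi) = 0$. Since $\varphi'$ is a monomorphism, $\alpha(\ker\varphi) = 0$. Then the fact
$e = \operatorname{coker}(\ker\varphi)$ implies that $\alpha$ factors through $e$ as $\alpha = \bar\alpha e$ for some $\bar\alpha$ with
domain $\bar A$. Consequently the left square in the adjusted diagram commutes, and
the first row is exact with the 0 inserted at the left. Since $e$ is an epimorphism,
$p = \operatorname{coker}\alpha = \operatorname{coker}(\bar\alpha e) = \operatorname{coker}\bar\alpha$, and the vertical line at the left is exact.

By a dual argument starting from a factorization of $\psi'$, we can replace the
triple $(C', \psi', \gamma)$ in similar fashion by $(\bar C', \bar\psi', \bar\gamma)$, see that $k = \ker\bar\gamma$, and add
a 0 at the end of the second row to obtain an exact sequence.

Next, let $(B_0, \tilde\psi, \tilde k)$ be a pullback of $(\psi, k)$. Proposition 4.40 shows that $\tilde\psi$
is an epimorphism and that $\ker\psi = \tilde k\ker\tilde\psi$. Since the first row is a short exact
sequence, we know that $\bar\varphi = \ker\psi$, and the condition $\ker\psi = \tilde k\ker\tilde\psi$ shows
that $\varphi_0 = \ker\tilde\psi$ satisfies $\bar\varphi = \tilde k\varphi_0$. This completes the dashed arrows in the top
part of the diagram. By a dual argument using $p = \operatorname{coker}\bar\alpha$, we complete the
dashed arrows in the bottom part of the diagram, obtaining an object $B'_0$ and
morphisms $\tilde p \in \operatorname{Hom}(B', B'_0)$ and $\tilde\varphi' \in \operatorname{Hom}(A'_0, B'_0)$ with $\tilde p\varphi' = \tilde\varphi' p$, and deducing
from $\bar\psi' = \operatorname{coker}\varphi'$ the fact that $\psi'_0 = \operatorname{coker}\tilde\varphi'$ satisfies $\bar\psi' = \psi'_0\tilde p$.

Lemma 4.37 shows from $\varphi_0 = \ker\tilde\psi$ that $\tilde\psi = \operatorname{coker}\varphi_0$, and it shows from
$\psi'_0 = \operatorname{coker}\tilde\varphi'$ that $\tilde\varphi' = \ker\psi'_0$. With these formulas in hand, we can construct
the connecting homomorphism. Define $\omega_0 = \tilde p\beta\tilde k$ in $\operatorname{Hom}(B_0, B'_0)$ to be the
composition down the center. Then $\omega_0\varphi_0 = \tilde p\beta\tilde k\varphi_0 = \tilde\varphi' p\bar\alpha = 0$, the last
equality holding because $p\bar\alpha = 0$. Therefore $\omega_0$ factors through $\tilde\psi = \operatorname{coker}\varphi_0$ as
$\omega_0 = \omega_1\tilde\psi$ for some $\omega_1 \in \operatorname{Hom}(C_0, B'_0)$. The morphism $\omega_1$ satisfies $\psi'_0\omega_1\tilde\psi =
\psi'_0\omega_0 = \psi'_0\tilde p\beta\tilde k = \bar\gamma k\tilde\psi = 0$, the last equality holding because $\bar\gamma k = 0$. Since $\tilde\psi$ is
an epimorphism, we can cancel it, obtaining $\psi'_0\omega_1 = 0$. Therefore $\omega_1$ factors
through $\tilde\varphi' = \ker\psi'_0$ as $\omega_1 = \tilde\varphi'\omega$ for some morphism $\omega \in \operatorname{Hom}(C_0, A'_0)$.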

The construction of $\omega$ is now complete, and the assertion is that the value of $\omega$
on members corresponds to what we expect from the proof of Lemma 4.6. Since
equivalences $\omega x \equiv \omega' x$ for some other candidate $\omega'$ for the connecting morphism
and for all $x \in_m C_0$ would imply that $\omega = \omega'$, the argument will show that we
have found the unique morphism taking the prescribed values on members.


During the verification we refer to $(*)$ to do the diagram chase. The member of
$C$ corresponding to $x \in_m C_0$ is $kx \in_m C$. Since $\psi$ is an epimorphism, Theorem
4.42c produces $b \in_m B$ with $\psi b \equiv kx$. Then $\psi'\beta b = \gamma\psi b \equiv \gamma kx = 0$, since
$\gamma k = 0$. Theorem 4.42e and exactness at $B'$ imply that $\varphi' a' \equiv \beta b$ for some
$a' \in_m A'$, and the class of $a'$ is unique (for the $b$ under consideration) by Theorem
4.42b because $\varphi'$ is a monomorphism. We shall verify that $\omega x \equiv pa'$, and then
the class of $\omega x$ matches what we expect from the proof of Lemma 4.6.




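First let us show that a different choice of $b$, say $b_1$, leads to the same class
$pa'$. We are given that $\psi b \equiv \psi b_1$. Let $a'$ and $a'_1$ be the corresponding members
of $A'$ with $\varphi' a' \equiv \beta b$ and $\varphi' a'_1 \equiv \beta b_1$. We shall make repeated use of Theorem
4.42f, letting subscripted $u$'s and $v$'s denote suitable epimorphisms. From $\psi b \equiv
\psi b_1$, Theorem 4.42f gives $\psi bu_1 - \psi b_1v_1 \equiv 0$, i.e., $\psi(bu_1 - b_1v_1) \equiv 0$. By
Theorem 4.42e and exactness at $B$, $bu_1 - b_1v_1 \equiv \varphi a$ for some $a \in_m A$. Hence
$\beta bu_1 - \beta b_1v_1 \equiv \beta\varphi a = \varphi'\alpha a$. Two applications of Theorem 4.42f starting from
$\beta bu_1 - \beta b_1v_1 \equiv \varphi'\alpha a$ give

$$\varphi' a' \equiv \beta b \equiv \varphi'\alpha a u_2 + \beta b_1 v_2,$$
$$\text{and then}\quad \varphi' a' u_3 - \varphi'\alpha a v_3 \equiv \beta b_1 \equiv \varphi' a'_1.$$

Since $\varphi'$ is a monomorphism, Theorem 4.42b says that

$$a' u_3 - \alpha a v_3 \equiv a'_1.$$

Applying $p$, we obtain $pa'u_3 - p\alpha av_3 \equiv pa'_1$. Since $p\alpha = 0$, we can drop the
term $p\alpha av_3$, and we conclude that $pa' \equiv pa'u_3 \equiv pa'_1$.

We can now return to the verification that $\omega x \equiv pa'$, making use of the adjusted
diagram as necessary.20 Since $\tilde\psi$ is an epimorphism, Theorem 4.42c produces
$b_0 \in_m B_0$ with $\tilde\psi b_0 \equiv x$. Then $\tilde kb_0 \in_m B$ has $\psi\tilde kb_0 = k\tilde\psi b_0 \equiv kx$. Hence $\tilde kb_0$
is a member of $B$ like $b$ and $b_1$ in the previous paragraph. The above argument
shows that $\beta\tilde kb_0 \in_m B'$ has $\beta\tilde kb_0 \equiv \varphi' a'$ for some $a' \in_m A'$ and that $pa' \in_m A'_0$
is what we should hope for as the value of $\omega x$. So we compute that

$$\tilde\varphi'\omega x = \omega_1 x \equiv \omega_1\tilde\psi b_0 = \omega_0 b_0 = \tilde p\beta\tilde k b_0 \equiv \tilde p\varphi' a' = \tilde\varphi' pa'.$$

Since $\tilde\varphi'$ is a monomorphism by the dual of Proposition 4.40, Theorem 4.42b
shows that $\omega x \equiv pa'$, which is the formula we were seeking.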


9. Problems


1. (a) Prove that the good category of all finitely generated abelian groups has 
enough projectives but not enough injectives. 
(b) Prove that the good category of all torsion abelian groups has enough injec- 
tives but not enough projectives. 

2. Let $\mathcal{C}_{\mathbb{Z}}$ be the category of all abelian groups. Give an example of a nonzero good
category $\mathcal{C}$ of abelian groups that has enough projectives and enough injectives
but for which no nonzero projective for $\mathcal{C}_{\mathbb{Z}}$ lies in $\mathcal{C}$ and no nonzero injective for
$\mathcal{C}_{\mathbb{Z}}$ lies in $\mathcal{C}$.


20 Warning: The construction of $\omega$ involves $B_0$ and $B'_0$, which are in the adjusted diagram but
are not in $(*)$. These objects do not necessarily coincide with the domain of $\ker\beta$ and the codomain
of $\operatorname{coker}\beta$.




3. Let $R$ be a semisimple ring in the sense of Chapter II, and let $\mathcal{C}_R$ be the category
of all unital left $R$ modules. Prove that every module in $\mathcal{C}_R$ is projective and
injective.

4. Let $R$ be a (commutative) principal ideal domain, and let $\mathcal{C}_R$ be the category of
all unital $R$ modules. A module $M$ in $\mathcal{C}_R$ is divisible if for each $a \neq 0$ in $R$ and
$x \in M$, there exists $y \in M$ with $ay = x$.
(a) Referring to Example 2 of injectives in Section 4, prove that injective for $\mathcal{C}_R$
implies divisible.
(b) Deduce from Proposition 4.15 that divisible implies injective for $\mathcal{C}_R$.

5. Let $R$ be a (commutative) principal ideal domain, and let $\mathcal{C}_R$ be the category of all
unital $R$ modules. Prove that every module $M$ in $\mathcal{C}_R$ has an injective resolution
of the form $0 \to M \to I_0 \to I_1 \to 0$ with $I_0$ and $I_1$ injective.

6. Let $\mathcal{C}, \mathcal{C}', \mathcal{C}''$ be good categories of modules with enough projectives and enough
injectives, let $G : \mathcal{C} \to \mathcal{C}'$ be a one-sided exact functor with derived functors $G_n$
or $G^n$, and let $F : \mathcal{C}' \to \mathcal{C}''$ be an exact functor.
(a) Prove that if $F$ is covariant, then $F \circ G$ is one-sided exact, and its derived
functors satisfy $(F \circ G)_n = F \circ G_n$ or $(F \circ G)^n = F \circ G^n$.
(b) Prove that if $F$ is contravariant, then $F \circ G$ is one-sided exact, and its derived
functors satisfy $(F \circ G)^n = F \circ G_n$ or $(F \circ G)_n = F \circ G^n$.

7. Let $\mathcal{C}, \mathcal{C}', \mathcal{C}''$ be good categories of modules with enough projectives and enough
injectives, let $F : \mathcal{C} \to \mathcal{C}'$ be an exact functor, and let $G : \mathcal{C}' \to \mathcal{C}''$ be a
one-sided exact functor with derived functors $G_n$ or $G^n$.
(a) Suppose that $F$ is covariant, that $G_n$ or $G^n$ is defined from projective
resolutions, and that $F$ carries projectives to projectives. Prove that $G \circ F$ is
one-sided exact and that its derived functors satisfy $(G \circ F)_n = G_n \circ F$ or
$(G \circ F)^n = G^n \circ F$.
(b) Suppose that $F$ is covariant, that $G_n$ or $G^n$ is defined from injective
resolutions, and that $F$ carries injectives to injectives. Prove that $G \circ F$ is
one-sided exact and that its derived functors satisfy $(G \circ F)_n = G_n \circ F$ or
$(G \circ F)^n = G^n \circ F$.
(c) Suppose that $F$ is contravariant, that $G_n$ or $G^n$ is defined from projective
resolutions, and that $F$ carries injectives to projectives. Prove that $G \circ F$ is
one-sided exact and that its derived functors satisfy $(G \circ F)^n = G^n \circ F$ or
$(G \circ F)_n = G_n \circ F$.
(d) Suppose that $F$ is contravariant, that $G_n$ or $G^n$ is defined from injective
resolutions, and that $F$ carries projectives to injectives. Prove that $G \circ F$ is
one-sided exact and that its derived functors satisfy $(G \circ F)^n = G^n \circ F$ or
$(G \circ F)_n = G_n \circ F$.




8. Let $G$ be a group, and let $F = (F^{+} \to \mathbb{Z})$ be a free resolution of the trivial
$\mathbb{Z}G$ module $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$. If $M$ is an abelian group on which $G$ acts by
automorphisms, then we know that the cohomology $H^n(G, M)$ is defined to be
the $n^{\text{th}}$ cohomology of the cochain complex $\operatorname{Hom}_{\mathbb{Z}G}(F^{+}, M)$ and the homology
$H_n(G, M)$ is defined to be the $n^{\text{th}}$ homology of the chain complex $F^{+} \otimes_{\mathbb{Z}G} M$.
Take for granted the result of Proposition 3.32 that if $G$ is a finite cyclic group
with generator $s$, then

$$\cdots \xrightarrow{\;N\;} \mathbb{Z}G \xrightarrow{\;T\;} \mathbb{Z}G \xrightarrow{\;N\;} \mathbb{Z}G \xrightarrow{\;T\;} \mathbb{Z}G \longrightarrow \mathbb{Z} \longrightarrow 0$$

is a free resolution of $\mathbb{Z}$, where $T$ and $N$ are the left $\mathbb{Z}G$ module homomorphisms
defined by

$$T = \text{multiplication by } (s) - (1),$$
$$N = \text{multiplication by } (1) + (s) + \cdots + (s^{|G|-1}).$$

Prove that $H^n(G, M) \cong H^{n+2}(G, M)$ and $H_n(G, M) \cong H_{n+2}(G, M)$ for all
$n \geq 1$ and all $M$ when $G$ is a finite cyclic group.
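
One ingredient of the resolution taken for granted in Problem 8, namely that consecutive maps compose to 0, can be checked by direct computation in the group ring. The following is a sketch only, assuming that elements of ZG for a cyclic group of a given order are stored as integer coefficient lists indexed by powers of the generator s; the function names are hypothetical, and the check says nothing about exactness.

```python
def multiply(a, b):
    """Product in Z[C_n], with a[i] the coefficient of s**i and n = len(a)."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % n] += ai * bj   # s**i * s**j = s**((i + j) mod n)
    return out


def T(order):
    """The element (s) - (1) of the group ring."""
    t = [0] * order
    t[0] -= 1
    t[1 % order] += 1                     # for order 1 the generator is the identity
    return t


def N(order):
    """The norm element (1) + (s) + ... + (s**(order - 1))."""
    return [1] * order


if __name__ == "__main__":
    # Both composites vanish, so the maps T and N alternate in a complex.
    print(multiply(T(5), N(5)))
    print(multiply(N(5), T(5)))
```

The vanishing reflects the identity s·N = N in the group ring, so that (s − 1)·N = 0.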


Problems 9–11 concern changes of rings. Fix a homomorphism $\rho : R \to S$ of rings
with identity. This homomorphism determines three functors of interest, denoted by
$\mathcal{F}_S^R : \mathcal{C}_S \to \mathcal{C}_R$, $P_R^S : \mathcal{C}_R \to \mathcal{C}_S$, and $I_R^S : \mathcal{C}_R \to \mathcal{C}_S$. The first takes an $S$ module $M$
and makes it into an $R$ module $\mathcal{F}_S^R(M)$ by the definition $rm = \rho(r)m$ for $r \in R$ and
$m \in M$; the effect on an $S$ homomorphism is to leave the function unchanged and to
regard it as an $R$ homomorphism; this functor is manifestly exact. For the second,
regard $S$ as an $(S, R)$ bimodule with right $R$ action given by $sr = s\rho(r)$, and define
$P_R^S(M) = S \otimes_R M$ for $M$ in $\operatorname{Obj}(\mathcal{C}_R)$ and $P_R^S(\varphi) = 1_S \otimes \varphi$ for $\varphi$ in $\operatorname{Hom}_R(M, N)$;
this functor is covariant and right exact. For the third, regard $S$ as an $(R, S)$ bimodule
with left $R$ action given by $rs = \rho(r)s$, and define $I_R^S(M) = \operatorname{Hom}_R(S, M)$ for $M$ in
$\operatorname{Obj}(\mathcal{C}_R)$ and $I_R^S(\varphi) = \operatorname{Hom}(1_S, \varphi)$ for $\varphi$ in $\operatorname{Hom}_R(M, N)$; this functor is covariant
and left exact.


9. If $\mathcal{C}$ and $\mathcal{D}$ are good categories of modules and if $F : \mathcal{C} \to \mathcal{D}$ and $G : \mathcal{D} \to \mathcal{C}$ are
covariant additive functors such that there exist isomorphisms of abelian groups

$$\operatorname{Hom}(F(A), B) \cong \operatorname{Hom}(A, G(B))$$

natural for $A$ in $\operatorname{Obj}(\mathcal{C})$ and for $B$ in $\operatorname{Obj}(\mathcal{D})$, then $F$ is said to be left adjoint to
$G$ and $G$ is said to be right adjoint to $F$.
(a) Prove that if $G$ carries onto maps in $\mathcal{D}$ to onto maps in $\mathcal{C}$, then $F$ carries
projectives in $\mathcal{C}$ to projectives in $\mathcal{D}$.
(b) Prove that if $F$ carries one-one maps in $\mathcal{C}$ to one-one maps in $\mathcal{D}$, then $G$
carries injectives in $\mathcal{D}$ to injectives in $\mathcal{C}$. (Educational note: The conclusions
in this problem extend to any abelian categories $\mathcal{C}$ and $\mathcal{D}$, and in this enlarged
setting, (b) follows from (a) by duality.)


10. (a) Prove that $P_R^S$ is left adjoint to $\mathcal{F}_S^R$.
(b) Deduce from the previous problem that $P_R^S$ sends projectives in $\mathcal{C}_R$ to
projectives in $\mathcal{C}_S$.
(c) Prove that if the right $R$ module $S$ is projective, then $P_R^S$ is exact.
(Educational note: In the subject of Lie algebra homology and cohomology,
this hypothesis is satisfied when $S$ is the universal enveloping algebra of a
Lie algebra $\mathfrak{g}$ over a field $K$, $R$ is the universal enveloping algebra of a Lie
subalgebra $\mathfrak{h}$ of $\mathfrak{g}$, and $\rho : R \to S$ is the inclusion. It is satisfied also in the
subject of homology and cohomology of groups if $S$ is the group algebra $KG$
of a group $G$ over a field $K$ and if $R$ is the group algebra $KH$ of a subgroup
$H$. See Problem 13c below.)
(d) Using Problem 7, prove that if the right $R$ module $S$ is projective, then
$\operatorname{Ext}^n_S(P_R^S M, N) \cong \operatorname{Ext}^n_R(M, \mathcal{F}_S^R N)$ naturally in each variable ($M$ being in
$\operatorname{Obj}(\mathcal{C}_R)$ and $N$ being in $\operatorname{Obj}(\mathcal{C}_S)$).
(e) Even without the assumption that the right $R$ module $S$ is projective, let
$X = (X^{+} \to M)$ be a projective resolution of a module $M$ in $\mathcal{C}_R$, and let
$Y = (Y^{+} \to P_R^S M)$ be a projective resolution of $P_R^S M$ in $\mathcal{C}_S$. Construct a
chain map from $P_R^S X$ to $Y$ extending the identity map on $P_R^S M$, and use it to
obtain the associated homomorphism $\operatorname{Ext}^n_S(P_R^S M, N) \to \operatorname{Ext}^n_R(M, \mathcal{F}_S^R N)$
natural in each variable.

11. (a) Prove that $I_R^S$ is right adjoint to $\mathcal{F}_S^R$.
(b) Deduce from Problem 9 that $I_R^S$ sends injectives in $\mathcal{C}_R$ to injectives in $\mathcal{C}_S$.
(c) Prove that if the right $R$ module $S$ is projective, then $I_R^S$ is exact.
(d) Using Problem 7, prove that if the right $R$ module $S$ is projective, then
$\operatorname{Ext}^n_S(M, I_R^S N) \cong \operatorname{Ext}^n_R(\mathcal{F}_S^R M, N)$ naturally in each variable ($M$ being in
$\operatorname{Obj}(\mathcal{C}_S)$ and $N$ being in $\operatorname{Obj}(\mathcal{C}_R)$).
(e) Even without the assumption that the right $R$ module $S$ is projective, let
$X = (X^{+} \leftarrow N)$ be an injective resolution of a module $N$ in $\mathcal{C}_R$, and let
$Y = (Y^{+} \leftarrow I_R^S N)$ be an injective resolution of $I_R^S N$ in $\mathcal{C}_S$. Construct a
chain map from $Y$ to $I_R^S X$ extending the identity map on $I_R^S N$, and use it
to obtain the associated homomorphism $\operatorname{Ext}^n_S(M, I_R^S N) \to \operatorname{Ext}^n_R(\mathcal{F}_S^R M, N)$
natural in each variable.


Problems 12-13 concern the effect on cohomology of groups of changing the group.
The main result is the exactness of the “inflation-restriction sequence”; this is applied
particularly in algebraic number theory to relate Brauer groups (see Chapter III) for
different field extensions. Let J and K be groups, and let $\rho : J \to K$ be a group
homomorphism. By the universal mapping property of group rings, $\rho$ extends to
a ring homomorphism, also denoted by $\rho$, from $\mathbb{Z}J$ into $\mathbb{Z}K$. For any group G,
we make use of the standard free resolution $F(G) = (F(G)^{+} \xrightarrow{\varepsilon} \mathbb{Z})$ of $\mathbb{Z}$ in the
category $\mathcal{C}_{\mathbb{Z}G}$, as described before Theorem 3.20. A $\mathbb{Z}$ basis of $F_n(G)$ consists of
all tuples $(g_0, \dots, g_n)$, and a $\mathbb{Z}G$ basis consists of those members of the $\mathbb{Z}$ basis
with $g_0 = 1$. In the context of the groups J and K, any $\mathbb{Z}K$ module M becomes
a $\mathbb{Z}J$ module by the formula $xm = \rho(x)m$ for $x \in \mathbb{Z}J$ and $m \in M$. In particular,
each free $\mathbb{Z}K$ module $F_n(K)$ can be regarded as a $\mathbb{Z}J$ module. Meanwhile, the
homomorphism $\rho : J \to K$ induces a function from the $\mathbb{Z}J$ basis of $F_n(J)$ into
$F_n(K)$ by the formula $\rho(1, j_1, \dots, j_n) = (1, \rho(j_1), \dots, \rho(j_n))$ for $j_1, \dots, j_n \in J$,
and this extends to a $\mathbb{Z}J$ homomorphism, still called $\rho$, of $F_n(J)$ into $F_n(K)$. A
look at the formula for the boundary operators $\partial_J$ and $\partial_K$ in Section III.5 shows
that $\rho$ is a chain map in the sense that $\partial_K \rho = \rho\, \partial_J$. If M is any unital left $\mathbb{Z}K$
module, then it follows that $\operatorname{Hom}(\rho, 1) : \operatorname{Hom}_{\mathbb{Z}K}(F(K), M) \to \operatorname{Hom}_{\mathbb{Z}J}(F(J), M)$ is a
cochain map. Consequently we get maps on cohomology for each n of the form
$H^n(\rho) : H^n(K, M) \to H^n(J, M)$. There are two cases of special interest:


(i) If $\rho : H \to G$ is the inclusion of a subgroup into a group, then the mapping
on cohomology is called the restriction homomorphism


$\mathrm{Res} : H^n(G, M) \to H^n(H, M).$


(ii) If H is a normal subgroup of G, let $\rho : G \to G/H$ be the quotient
homomorphism. For any $\mathbb{Z}G$ module M, let $M^H$ be the subgroup of H
invariants. Then G/H acts on $M^H$. The above construction is applicable
to the module $M^H$ for the group ring $\mathbb{Z}(G/H)$ of G/H, and we form the
mapping on cohomology $H^n(G/H, M^H) \to H^n(G, M^H)$. The inclusion of
the $\mathbb{Z}G$ module $M^H$ in M induces a mapping $H^n(G, M^H) \to H^n(G, M)$,
and the composition is called the inflation homomorphism


$\mathrm{Inf} : H^n(G/H, M^H) \to H^n(G, M).$


When H is a normal subgroup of G and M is a $\mathbb{Z}G$ module and $q \geq 1$ is an integer
such that $H^k(H, M) = 0$ for $1 \leq k \leq q - 1$, the inflation-restriction sequence is
the sequence of abelian groups and homomorphisms


$$0 \longrightarrow H^q(G/H, M^H) \xrightarrow{\ \mathrm{Inf}\ } H^q(G, M) \xrightarrow{\ \mathrm{Res}\ } H^q(H, M).$$


12. For q = 1, use direct arguments to prove the exactness of the inflation-restriction
sequence by carrying out the following steps:

(a) By sorting out the isomorphism $\Phi_q : \operatorname{Hom}_{\mathbb{Z}G}(F_q, M) \to C^q(G, M)$ of
Section III.5, show that the effect of a homomorphism $\rho : G \to G'$ on
$C^q(G', M)$ is given by $(\rho^* f)(g_1, \dots, g_q) = f(\rho(g_1), \dots, \rho(g_q))$.

(b) Verify that $\mathrm{Res} \circ \mathrm{Inf} = 0$ by looking at cocycles.

(c) Show that Inf is one-one on $H^1(G/H, M^H)$ by showing that any cocycle
$f : G/H \to M^H$ that is a coboundary when viewed as a function on G is
itself a coboundary for G/H.


(d) Show that every member of ker(Res) lies in image(Inf) by showing that any
cocycle $f : G \to M$ whose restriction to H is a coboundary may be adjusted
to be 0 on H and that an examination of the equation $f(st) = f(s) + s f(t)$
in this case shows f to be a cocycle of G/H with values in $M^H$.


13. Assume inductively that $q > 1$, that $H^k(H, M) = 0$ for $1 \leq k \leq q - 1$, and that
the inflation-restriction sequence is exact for all N for degree $q - 1$ whenever
$H^k(H, N) = 0$ for $1 \leq k < q - 1$. Form $B = I_{\mathbb{Z}}^{\mathbb{Z}G} F_{\mathbb{Z}G}^{\mathbb{Z}} M = \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}G, M)$ as
in Problems 9-11. Elements of B can be identified with functions $\varphi$ on G with
values in M, and G acts by $(g_0 \varphi)(g) = \varphi(g g_0)$.


(a) For $m \in M$, define $\varphi_m(t) = tm$. Show that $m \mapsto \varphi_m$ is a one-one $\mathbb{Z}G$
homomorphism of M into B. If $N = B/M$, then the sequence
$0 \to M \to B \to N \to 0$ is therefore exact in $\mathcal{C}_{\mathbb{Z}G}$.

(b) Use Problem 11 to verify that $H^k(G, B) \cong \operatorname{Ext}^k_{\mathbb{Z}}(\mathbb{Z}, F_{\mathbb{Z}G}^{\mathbb{Z}} M)$, and deduce
that $H^k(G, B) = 0$ for $k \geq 1$.

(c) Verify the equality of right $\mathbb{Z}H$ modules $\mathbb{Z}G = A \otimes_{\mathbb{Z}} \mathbb{Z}H$ for some free
abelian group A.

(d) Using (c), show that $F_{\mathbb{Z}G}^{\mathbb{Z}H} B \cong \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}H, \operatorname{Hom}_{\mathbb{Z}}(A, M))$, and deduce that
$H^k(H, B) = 0$ for $k \geq 1$.

(e) Using the hypothesis that $H^1(H, M) = 0$ and a long exact sequence
associated to the short exact sequence in (a), show that
$0 \to M^H \to B^H \to N^H \to 0$ is exact.

(f) Prove that $\mathbb{Z} \otimes_{\mathbb{Z}H} \mathbb{Z}G \cong \mathbb{Z}(G/H)$ as right $\mathbb{Z}G$ modules, where $\mathbb{Z}(G/H)$ is
the integral group ring of G/H.

(g) Show that $B^H \cong I_{\mathbb{Z}}^{\mathbb{Z}(G/H)} M$, and deduce that $H^k(G/H, B^H) = 0$ for $k \geq 1$.

(h) Using the long exact sequences for G and for H associated to the short exact
sequence of (a), as well as the long exact sequence for G/H associated to
the short exact sequence of (e), establish isomorphisms of abelian groups

$$H^{q-1}(G/H, N^H) \cong H^q(G/H, M^H), \qquad H^{q-1}(G, N) \cong H^q(G, M), \qquad H^{q-1}(H, N) \cong H^q(H, M).$$

(i) Set up the diagram

$$\begin{array}{ccccccc}
0 & \longrightarrow & H^{q-1}(G/H, N^H) & \longrightarrow & H^{q-1}(G, N) & \longrightarrow & H^{q-1}(H, N) \\
 & & \downarrow & & \downarrow & & \downarrow \\
0 & \longrightarrow & H^{q}(G/H, M^H) & \longrightarrow & H^{q}(G, M) & \longrightarrow & H^{q}(H, M)
\end{array}$$

show that it is commutative, and deduce from the foregoing that the
inflation-restriction sequence is exact for M in degree q. (Educational note:
For an application to Brauer groups, let $F \subseteq K \subseteq L$ be fields, and assume
that K/F, L/F, and L/K are all finite Galois extensions. The groups in
question are $G = \operatorname{Gal}(L/F)$, $H = \operatorname{Gal}(L/K)$, and $G/H = \operatorname{Gal}(K/F)$, and
the modules in question are $M = L^{\times}$ and $M^H = K^{\times}$. The index q is to
be 2, and the vanishing of $H^1$ is by Hilbert’s Theorem 90. The conclusion
is that the sequence $0 \to B(K/F) \to B(L/F) \to B(L/K)$ is exact.)


Problems 14-16 introduce the cup product in the cohomology of groups. This is a
construction having applications to topology and algebraic number theory. Let G
be a group, and form the standard free resolution $F = (F^{+} \xrightarrow{\varepsilon} \mathbb{Z})$ of $\mathbb{Z}$ in the
category $\mathcal{C}_{\mathbb{Z}G}$, as described before Theorem 3.20. A $\mathbb{Z}$ basis of $F_n$ consists of all
tuples $(g_0, \dots, g_n)$, and a $\mathbb{Z}G$ basis consists of those members of the $\mathbb{Z}$ basis with
$g_0 = 1$. Let $\partial$ denote the boundary operator, with the subscript dropped that indicates
the degree. Define $\varphi_{p,q} : F_{p+q} \to F_p \otimes_{\mathbb{Z}} F_q$ by


$$\varphi_{p,q}(g_0, \dots, g_{p+q}) = (g_0, \dots, g_p) \otimes (g_p, \dots, g_{p+q}).$$


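14. Check that $(\varepsilon \otimes \varepsilon) \circ \varphi_{0,0} = \varepsilon$ and that each $\varphi_{p,q}$ with $p \geq 0$ and $q \geq 0$ is a $\mathbb{Z}G$
homomorphism satisfying


$$\varphi_{p,q} \circ \partial = (\partial \otimes 1) \circ \varphi_{p+1,q} + (-1)^p (1 \otimes \partial) \circ \varphi_{p,q+1}.$$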
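
The identity in Problem 14 can be sanity-checked by machine on basis tuples before one writes out a proof. The following is a minimal sketch, not a proof. It assumes that the boundary operator of Section III.5 is the usual alternating sum that omits one entry of the tuple at a time; the function names are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from itertools import product


def boundary(t):
    """Assumed formula: d(g_0,...,g_n) = sum_i (-1)^i (g_0,...,omit g_i,...,g_n)."""
    out = Counter()
    for i in range(len(t)):
        out[t[:i] + t[i + 1:]] += (-1) ** i
    return out


def phi(p, q, t):
    """phi_{p,q}(g_0,...,g_{p+q}) = (g_0,...,g_p) (x) (g_p,...,g_{p+q})."""
    assert len(t) == p + q + 1
    return (t[:p + 1], t[p:])


def left_side(p, q, t):
    """phi_{p,q} o d applied to a basis tuple t of F_{p+q+1}."""
    out = Counter()
    for s, c in boundary(t).items():
        out[phi(p, q, s)] += c
    return out


def right_side(p, q, t):
    """(d (x) 1) o phi_{p+1,q} + (-1)^p (1 (x) d) o phi_{p,q+1} applied to t."""
    out = Counter()
    front, back = phi(p + 1, q, t)
    for s, c in boundary(front).items():
        out[(s, back)] += c
    front, back = phi(p, q + 1, t)
    for s, c in boundary(back).items():
        out[(front, s)] += (-1) ** p * c
    return out


def nonzero(chain):
    """Drop terms whose coefficient has cancelled to 0."""
    return {k: v for k, v in chain.items() if v != 0}


def identity_holds(p, q, elements):
    """Compare both sides on every tuple of length p+q+2 with entries in `elements`."""
    return all(
        nonzero(left_side(p, q, t)) == nonzero(right_side(p, q, t))
        for t in product(elements, repeat=p + q + 2)
    )
```

For instance, `identity_holds(1, 2, range(3))` compares the two sides on all 243 tuples of length 5 with entries in a three-element set. The check never uses the group law, which reflects the fact that the identity is formal in the positions of the tuple.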


15. If A and B are abelian groups on which G acts by automorphisms, show that G
acts by automorphisms on $A \otimes_{\mathbb{Z}} B$ in such a way that $g(a \otimes b) = ga \otimes gb$ for
all $a \in A$, $b \in B$, $g \in G$. Thus whenever A and B are unital left $\mathbb{Z}G$ modules,
then so is $A \otimes_{\mathbb{Z}} B$.


16. For any unital left $\mathbb{Z}G$ module M, we work with $\operatorname{Hom}_{\mathbb{Z}G}(F_n, M)$ as the space of
n-cochains. (Here it is not necessary to unravel the isomorphism given in Section
III.5 that relates $\operatorname{Hom}_{\mathbb{Z}G}(F_n, M)$ to the space $C^n(G, M)$ of cochains defined in
Chapter VII of Basic Algebra.) Define the coboundary operator on the complex
$\operatorname{Hom}_{\mathbb{Z}G}(F^{+}, M)$ to be $d = \operatorname{Hom}(\partial, 1)$. For any unital left $\mathbb{Z}G$ modules A and
B, let $f \in \operatorname{Hom}_{\mathbb{Z}G}(F_p, A)$ and $g \in \operatorname{Hom}_{\mathbb{Z}G}(F_q, B)$ be given. The product cochain
$f \cdot g$ is the member of $\operatorname{Hom}_{\mathbb{Z}G}(F_{p+q}, A \otimes_{\mathbb{Z}} B)$ given by $f \cdot g = (f \otimes g) \circ \varphi_{p,q}$.

(a) Check that $d(f \cdot g) = (df) \cdot g + (-1)^p f \cdot (dg)$.

(b) How does it follow that this product descends to a homomorphism of abelian
groups $a \otimes b \mapsto a \cup b$ carrying the space $H^p(G, A) \otimes_{\mathbb{Z}} H^q(G, B)$ to
$H^{p+q}(G, A \otimes_{\mathbb{Z}} B)$? The descended mapping is called the cup product.

(c) Explain why the cup product is functorial in each variable A and B.

(d) Explain why the cup product for p = 0 and q = 0 may be identified with
the mapping on invariants given by $A^G \otimes_{\mathbb{Z}} B^G \to (A \otimes_{\mathbb{Z}} B)^G$.


Problems 17-20 introduce flat R modules, R being a ring with identity. These modules
are of interest in topology and algebraic geometry. Let $R^o$ be the opposite ring of
R; right R modules may be identified with left $R^o$ modules. Let $\mathcal{C}_R$ be the category
of all unital left R modules; tensor product over R can be regarded as a functor in
the second variable, carrying $\mathcal{C}_R$ to $\mathcal{C}_{\mathbb{Z}}$, or as a functor in the first variable, carrying
$\mathcal{C}_{R^o}$ to $\mathcal{C}_{\mathbb{Z}}$. A unital right R module M (i.e., a unital left $R^o$ module) is called flat if
$M \otimes_R (\,\cdot\,)$ is an exact functor from $\mathcal{C}_R$ to $\mathcal{C}_{\mathbb{Z}}$. Since this functor is anyway right exact,
M is flat if and only if tensoring with M carries one-one maps to one-one maps, i.e.,
if and only if whenever $f : A \to B$ is one-one, then $1_M \otimes f : M \otimes_R A \to M \otimes_R B$
is one-one. Take as known the analog for the functor Tor of all the facts about Ext
proved in Section 7.


17. Prove for unital right R modules that

(a) the right R module R is flat,

(b) a direct sum $F = \bigoplus_{s \in S} F_s$ is flat if and only if each $F_s$ is flat,

(c) any projective in $\mathcal{C}_{R^o}$ is flat.


18. Let M be a unital right R module. For each finite subset F of M, let $M_F$ be the
right R submodule of M generated by the members of F. Prove that M is flat if
and only if each $M_F$ is flat.


19. Let B be in $\mathcal{C}_R$, write B as the R homomorphic image of a free left R module F,
and form the exact sequence $0 \to K \to F \to B \to 0$ in which K is the kernel
of $F \to B$. Prove for each unital right R module A that the sequence


$$0 \to \operatorname{Tor}_1^R(A, B) \to A \otimes_R K \to A \otimes_R F \to A \otimes_R B \to 0$$


is exact. Deduce that A is flat if and only if $\operatorname{Tor}_1^R(A, B) = 0$ for all B.


20. Suppose that R is a (commutative) principal ideal domain, so that in particular
$R = R^o$. The torsion submodule T(M) of a module M in $\mathcal{C}_R$ consists of all
$m \in M$ with $rm = 0$ for some $r \neq 0$ in R.

(a) Suppose that M is of the form $M = F \oplus T(M)$, where F is a free R
module. Using the exact sequence $0 \to F \to M \to T(M) \to 0$, prove
that $\operatorname{Tor}_1^R(M, B) \cong \operatorname{Tor}_1^R(T(M), B)$ for all modules B in $\mathcal{C}_R$.

(b) Deduce from (a) and Problem 18 that a module M in $\mathcal{C}_R$ is flat if and only
if T(M) is flat. (Note that M is not assumed to be of the form $F \oplus T(M)$.)

(c) By comparing the one-one inclusion $(a) \subseteq R$ for a nonzero $a \in R$ with the
induced map from $(a) \otimes_R M$ to $R \otimes_R M$, prove that $T(M) \neq 0$ implies M
not flat.

(d) Deduce that a module M in $\mathcal{C}_R$ is flat if and only if M has 0 torsion, i.e., if
and only if M is torsion free. (Educational note: In combination with the
result of Problem 19, this condition explains the use of the notation “Tor”
for the first derived functor of tensor product.)
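
The mechanism behind Problem 20c can be seen concretely for R = Z and M = Z/nZ. The sketch below rests on two identifications that are assumptions of this illustration rather than statements from the text: the ideal (a) is isomorphic to Z via a ↦ 1, so that (a) ⊗ M and Z ⊗ M are both identified with M, and under these identifications the map induced by the inclusion (a) ⊆ Z becomes multiplication by a on M. The function name is hypothetical.

```python
def kernel_of_multiplication(a, n):
    """Elements x of Z/nZ with a*x = 0.

    Under the identifications described above, this is the kernel of the map
    (a) (x) Z/nZ -> Z (x) Z/nZ induced by the inclusion of the ideal (a) in Z.
    """
    return [x for x in range(n) if (a * x) % n == 0]
```

With a = 2 and n = 4 the kernel is `[0, 2]`, so tensoring with Z/4Z does not preserve the one-one map (2) ⊆ Z, and Z/4Z is not flat. With a = 3 and n = 4 the kernel is `[0]`; that single choice of a proves nothing about flatness, since flatness requires the condition for every one-one map.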


Problems 21-25 deal with double chain complexes of abelian groups. A double
chain complex is a system $\{E_{p,q}\}$ of abelian groups defined for all integers p and q
and having boundary homomorphisms $\partial'_{p,q} : E_{p,q} \to E_{p-1,q}$ and $\partial''_{p,q} : E_{p,q} \to E_{p,q-1}$
such that $\partial'_{p-1,q}\, \partial'_{p,q} = 0$, $\partial''_{p,q-1}\, \partial''_{p,q} = 0$, and $\partial'_{p,q-1}\, \partial''_{p,q} + \partial''_{p-1,q}\, \partial'_{p,q} = 0$. This set
of problems will assume that $E_{p,q} = 0$ if either p or q is sufficiently negative.

21. Let $\{E_{p,q}\}$ be a double complex of abelian groups with boundary homomorphisms
as above, let $E_n = \bigoplus_{p+q=n} E_{p,q}$, and define $\partial_n : E_n \to E_{n-1}$ by
$\partial_n\big|_{E_{p,q}} = \partial'_{p,q} + \partial''_{p,q}$. Show that the maps $\partial_n$ make the system $\{E_n\}$ into a chain complex.
(Note: The indexing on the boundary maps has been changed by 1 from earlier in
the chapter in order to simplify the notation that occurs later in these problems.)


22. Let $\mathcal{C}_1$ be a good category of unital left R modules, and let $\mathcal{C}_2$ be a good category
of unital left $R^o$ modules; the latter modules are to be regarded as unital right
R modules. Let $C = \{C_p\}_{p > -\infty}$ and $D = \{D_q\}_{q > -\infty}$ be chain complexes
with boundary maps $\alpha_p : C_p \to C_{p-1}$ in $\mathcal{C}_2$ and $\beta_q : D_q \to D_{q-1}$ in $\mathcal{C}_1$. It is
assumed that $C_p = 0$ for p sufficiently negative and that $D_q = 0$ for q sufficiently
negative. Define $E_{p,q} = C_p \otimes_R D_q$, $\partial'_{p,q} = \alpha_p \otimes 1$, and $\partial''_{p,q} = (-1)^p (1 \otimes \beta_q)$.
Prove that $\{E_{p,q}\}$ with these mappings is a double complex of abelian groups.
(Educational note: Therefore the previous problem creates a chain complex
$\{E_n\}$ with boundary maps $\partial_n : E_n \to E_{n-1}$ from this set of data. One writes
$E = C \otimes_R D$ for this chain complex and calls it the tensor product of the two
chain complexes.)
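
The sign $(-1)^p$ in the definition of $\partial''$ is what makes the total boundary square to 0, and this can be watched in a small case. The following is a minimal sketch under stated assumptions: R = Z, every module is free of finite rank, and the two sample complexes `C_EX` and `D_EX` (each Z → Z in degrees 1 and 0, with maps ×2 and ×3) are hypothetical data invented for the illustration. A chain of the tensor product is stored as a dictionary from pairs of basis labels to integer coefficients.

```python
from collections import Counter

# A complex of free Z modules is stored as a dictionary of boundary matrices:
# X[p][i][j] is the coefficient of basis element i of X_{p-1} in the boundary
# of basis element j of X_p.  Degrees with no entry have zero boundary.
C_EX = {1: [[2]]}   # hypothetical example: Z --(x2)--> Z in degrees 1, 0
D_EX = {1: [[3]]}   # hypothetical example: Z --(x3)--> Z in degrees 1, 0


def d_basis(X, p, j):
    """Boundary of the j-th basis element of X_p, as {(p-1, i): coefficient}."""
    out = Counter()
    for i, row in enumerate(X.get(p, [])):
        if row[j]:
            out[(p - 1, i)] += row[j]
    return out


def total_boundary(C, D, chain):
    """Apply d = d' + d'' to a chain {((p, i), (q, j)): coefficient} of C (x) D."""
    out = Counter()
    for ((p, i), (q, j)), c in chain.items():
        for a, ca in d_basis(C, p, i).items():      # d' = alpha (x) 1
            out[(a, (q, j))] += c * ca
        for b, cb in d_basis(D, q, j).items():      # d'' = (-1)^p (1 (x) beta)
            out[((p, i), b)] += (-1) ** p * c * cb
    return Counter({k: v for k, v in out.items() if v != 0})
```

Starting from the generator of $C_1 \otimes D_1$, one application of `total_boundary` gives coefficient 2 on the generator of $C_0 \otimes D_1$ and coefficient −3 on the generator of $C_1 \otimes D_0$, and a second application gives the empty chain because 6 − 6 = 0. Without the sign the two contributions would add to 12.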


23. In the notation of the previous problem, suppose that $C_p = 0$ if $p < 0$ and
$D_q = 0$ if $q < 0$. Let $Z_p = \ker \alpha_p$ and $\overline{Z}_q = \ker \beta_q$. Prove that if c is in $Z_p$
and d is in $\overline{Z}_q$, then $c \otimes d$ is in the subgroup $\ker(\partial'_{p,q} + \partial''_{p,q})$ of $E_{p,q}$ and that as
a consequence, there is a canonical homomorphism of $H_p(C) \otimes_R H_q(D)$ into
$H_{p+q}(C \otimes_R D)$.


24. Suppose that a double complex $E_{p,q}$ of abelian groups has $E_{p,q} = 0$ if $p < -1$ or
$q < -1$ or $p = q = -1$. Suppose further that $E_{\cdot,q}$ is exact for each $q \geq 0$ and
$E_{p,\cdot}$ is exact for each $p \geq 0$. Prove that the $r^{\text{th}}$ homology of $E_{-1,q}$ as q varies
matches the $r^{\text{th}}$ homology of $E_{p,-1}$ as p varies. To do so, start from a cycle a
under $\partial''$ in $E_{-1,k}$ with $k \geq 0$. It is mapped to 0 by $\partial'$, hence has a preimage $a'$
under $\partial'$ in $E_{0,k}$. The element $\partial'' a'$ in $E_{0,k-1}$ is mapped to 0 by $\partial'$, hence has a
preimage $a''$ in $E_{1,k-1}$. Continue in this way, and arrive at a cycle in $E_{k,-1}$. Then
sort out the details.


25. With notation as in Problem 22, let A be in $\mathcal{C}_2$, and let B be in $\mathcal{C}_1$. Let
$C = (C^{+} \to A)$ be a projective resolution of A, and let $D = (D^{+} \to B)$ be a
projective resolution of B. Form $E = C \otimes_R D$ as in Problem 22, and apply
Problem 24 to give a direct proof (without the machinery of Section 7) that one
gets the same result for $\operatorname{Tor}^R_{*}(A, B)$ by using a projective resolution in the first
variable as by using a projective resolution in the second variable.


Problems 26-31 concern the Künneth Theorem for homology and the Universal
Coefficient Theorem for homology. Both these results have applications to topology.
It will be assumed throughout that R is a (commutative) principal ideal domain.


STATEMENT OF KÜNNETH THEOREM. Let C and D be chain complexes
over the principal ideal domain R, and assume that all modules in negative
degrees are 0 and that C is flat. Then there is a natural short exact sequence


$$0 \longrightarrow \bigoplus_{p+q=n} \bigl(H_p(C) \otimes_R H_q(D)\bigr) \xrightarrow{\ \alpha_n\ } H_n(C \otimes_R D) \xrightarrow{\ \beta_{n-1}\ } \bigoplus_{p+q=n-1} \operatorname{Tor}_1^R\bigl(H_p(C), H_q(D)\bigr) \longrightarrow 0.$$


Moreover, the exact sequence splits, but not naturally.


The point of the theorem is to give circumstances under which the homology of
each of two chain complexes C and D determines the homology of the tensor product
$E = C \otimes_R D$, the tensor product complex being defined as in Problem 22. Problem 26
below shows that some further hypothesis is needed beyond the limitation on R. A
sufficient condition is that one of C and D, say C, be flat in the sense that all
the modules in it satisfy the condition of flatness defined in Problems 17-20. The
problems in the set carry out some of the steps in proving the Künneth Theorem, and
then they derive the Universal Coefficient Theorem for homology as a consequence.
To keep the ideas in focus, the problems will suppress certain isomorphisms, writing
them as equalities.

26. With $R = \mathbb{Z}$, let $C = D$ be the chain complex with $C_0 = \mathbb{Z}/2\mathbb{Z}$ and with $C_p = 0$
for $p \neq 0$. Let $C'$ be the chain complex with $C'_0 = \mathbb{Z}$, with $C'_1 = \mathbb{Z}$, and with
$C'_p = 0$ for $p > 1$ and for $p < 0$. Let the boundary map from $C'_1$ to $C'_0$ be
$\times 2$. Compute the homology of $C$, $C'$, $D$, $C \otimes_{\mathbb{Z}} D$, and $C' \otimes_{\mathbb{Z}} D$, and justify the
conclusion that the homology of each of two chain complexes does not determine
the homology of their tensor product.


27. Let $\partial'$ be the boundary map for C. Show how to set up an exact sequence

$$0 \longrightarrow Z \xrightarrow{\ \iota\ } C \xrightarrow{\ \partial'\ } B' \longrightarrow 0$$

of complexes in which each module in Z is the submodule of cycles of the
corresponding module in C, $\iota$ is the inclusion, B is the complex of boundaries,
and $B'$ is B with its indices shifted by 1. Why does it follow from the fact that
C is flat that Z, B, and $B'$ are flat?


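28. Explain why

$$0 \longrightarrow Z \otimes_R D \xrightarrow{\ \iota \otimes 1\ } C \otimes_R D \xrightarrow{\ \partial' \otimes 1\ } B' \otimes_R D \longrightarrow 0$$

is exact even though D is not assumed to be flat.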


29. The long exact sequence in homology corresponding to the short exact sequence
in the previous problem has segments of the form

$$H_{n+1}(B' \otimes_R D) \xrightarrow{\ \omega_n\ } H_n(Z \otimes_R D) \xrightarrow{\ \iota_n \otimes 1\ } H_n(C \otimes_R D) \xrightarrow{\ \partial'_n \otimes 1\ } H_n(B' \otimes_R D) \xrightarrow{\ \omega_{n-1}\ } H_{n-1}(Z \otimes_R D).$$

Let $\partial''$ be the boundary map for D, and let $\overline{Z}$, $\overline{B}$, and $\overline{B}'$ be the counterparts for
D of the complexes Z, B, and $B'$ for C. Show that

(a) the boundary map in $B' \otimes_R D$ may be regarded as $1 \otimes \partial''$ because the
boundary map in $B'$ is 0.

(b) $\ker(1 \otimes \partial'')_n = (B' \otimes_R \overline{Z})_n$ and $\operatorname{image}(1 \otimes \partial'')_{n+1} = (B' \otimes_R \overline{B})_n$ because
$B'$ is flat.

(c) $H_n(B' \otimes_R D) \cong (B \otimes_R H(D))_{n-1}$ because $B'$ is flat. (This isomorphism
will be treated as an equality below.)

(d) similarly $H_n(Z \otimes_R D) \cong (Z \otimes_R H(D))_n$. (This isomorphism will be treated
as an equality below.)


30. Form an exact sequence

$$0 \longrightarrow B \longrightarrow Z \longrightarrow H(C) \longrightarrow 0$$

of complexes, form the low-degree part of the long exact sequence corresponding
to applying the functor $(\,\cdot\,) \otimes_R H(D)$, namely

$$0 \to \operatorname{Tor}_1^R(H(C), H(D))_n \to (B \otimes_R H(D))_n \to (Z \otimes_R H(D))_n \to (H(C) \otimes_R H(D))_n \to 0,$$

and rewrite it by (c) and (d) of Problem 29 as

$$0 \to \operatorname{Tor}_1^R(H(C), H(D))_n \xrightarrow{\ \beta'_n\ } H_{n+1}(B' \otimes_R D) \xrightarrow{\ \omega_n\ } H_n(Z \otimes_R D) \xrightarrow{\ \alpha'_n\ } (H(C) \otimes_R H(D))_n \to 0.$$

(a) Why is the term $\operatorname{Tor}_1^R(Z, H(D))$ in the long exact sequence equal to 0?

(b) In the 5-term exact sequence of Problem 29, rewrite the part of the sequence
centered at the map $\partial'_n \otimes 1$ in such a way that two exact sequences

$$H_n(Z \otimes_R D) \xrightarrow{\ \iota_n \otimes 1\ } H_n(C \otimes_R D) \xrightarrow{\ q\ } \operatorname{coker}(\iota_n \otimes 1) \longrightarrow 0$$

and

$$0 \longrightarrow \ker(\omega_{n-1}) \xrightarrow{\ i\ } H_n(B' \otimes_R D) \xrightarrow{\ \omega_{n-1}\ } H_{n-1}(Z \otimes_R D)$$

result. Why can the group $\ker \omega_{n-1}$ and the homomorphism i be taken to be
$\operatorname{Tor}_1^R(H(C), H(D))_{n-1}$ and $\beta'_{n-1}$?

(c) Why in (b) can $\operatorname{coker}(\iota_n \otimes 1)$ and q be taken to be $\operatorname{Tor}_1^R(H(C), H(D))_{n-1}$
and some onto homomorphism $\beta_{n-1}$ such that $\beta'_{n-1} \beta_{n-1} = \partial'_n \otimes 1$?

(d) Arguing similarly with the map $\iota_n \otimes 1$ in Problem 29, obtain a factorization
$\iota_n \otimes 1 = \alpha_n \alpha'_n$ in which $\alpha'_n : (Z \otimes_R H(D))_n \to (H(C) \otimes_R H(D))_n$ is onto
and $\alpha_n : (H(C) \otimes_R H(D))_n \to H_n(C \otimes_R D)$ is one-one.

(e) The maps $\alpha_n$ and $\beta_{n-1}$ having now been defined in the sequence in the
statement of the Künneth Theorem, prove that the sequence is exact.


31. (Universal Coefficient Theorem) By specializing D in the statement of the
Künneth Formula to a chain complex that is a module M in dimension 0 and is 0
in all other dimensions, obtain the natural short exact sequence


$$0 \longrightarrow H_n(C) \otimes_R M \longrightarrow H_n(C \otimes_R M) \longrightarrow \operatorname{Tor}_1^R(H_{n-1}(C), M) \longrightarrow 0,$$


valid whenever R is a principal ideal domain and C is a chain complex whose
modules are all 0 in dimension < 0. (Educational note: The exact sequence
splits, but not naturally.)


Problems 32-35 concern abelian categories. 


32. Let C be an abelian category. Let D be the category for which Obj(D) consists of
all chain complexes of objects and morphisms in C and for which Morph(X, Y)
for any two objects X and Y in D consists of all chain maps from X to Y. Prove
that D is an abelian category.


33. Consider the snake diagram in the category of all abelian groups consisting of the
four rightmost groups in the first row and the four leftmost groups in the second
row of the following commutative diagram:


$$\begin{array}{ccccccccc}
0 & \longrightarrow & \mathbb{Z} & \xrightarrow{\ \times 8\ } & \mathbb{Z} & \xrightarrow{\ 1 \,\mapsto\, 1 \bmod 8\ } & \mathbb{Z}/8\mathbb{Z} & \longrightarrow & 0 \\
 & & \downarrow{\scriptstyle \times 4} & & \downarrow{\scriptstyle \times 2} & & \downarrow{\scriptstyle 1 \bmod 8 \,\mapsto\, 2 \bmod 4} & & \\
0 & \longrightarrow & \mathbb{Z} & \xrightarrow{\ \times 4\ } & \mathbb{Z} & \xrightarrow{\ 1 \,\mapsto\, 1 \bmod 4\ } & \mathbb{Z}/4\mathbb{Z} & \longrightarrow & 0
\end{array}$$


Adjoin the 0’s to make the diagram become what is displayed. Following the
steps in the example of a diagram chase in Section 8, extend this diagram to the
auxiliary diagram that appears in that discussion, and show that (Bo, k) for the
extended diagram is not a kernel of 6.


34. For a general abelian category C and any M in Obj(C), verify that $\operatorname{Hom}(\,\cdot\,, M)$
is a left exact contravariant functor from C to $\mathcal{C}_{\mathbb{Z}}$ and $\operatorname{Hom}(M, \,\cdot\,)$ is a left exact
covariant functor from C to $\mathcal{C}_{\mathbb{Z}}$.


35. Proposition 4.19 shows for any good category C of unital left R modules that a
module P in C is projective for C if and only if $\operatorname{Hom}(P, \,\cdot\,)$ is an exact functor,
if and only if every short exact sequence $0 \to X \to Y \to P \to 0$ splits.
Rewrite this proof in such a way that it applies to arbitrary abelian categories
C. For the step in the argument that the splitting of every short exact sequence
$0 \to X \to Y \to P \to 0$ implies that P is projective, use the notion of pullback
that is developed in Section 8.


CHAPTER V 


Three Theorems in Algebraic Number Theory 


Abstract. This chapter establishes some essential foundational results in the subject of algebraic 
number theory beyond what was already in Basic Algebra. 

Section 1 puts matters in perspective by examining what was proved in Chapter I for quadratic
number fields and picking out questions that need to be addressed before one can hope to develop a 
comparable theory for number fields of degree greater than 2. 

Sections 2—4 concern the field discriminant of a number field. Section 2 contains the definition of 
discriminant, as well as some formulas and examples. The main result of Section 3 is the Dedekind 
Discriminant Theorem. This concerns how prime ideals (p) in Z split when extended to the ideal 
(p)R in the ring of integers R of a number field. The theorem says that ramification, i.e., the
occurrence of some prime ideal factor in R to a power greater than 1, occurs if and only if p divides
the field discriminant. The theorem is proved only in a very useful special case, the general case
being deferred to Chapter VI. The useful special case is obtained as a consequence of Kummer’s
criterion, which relates the factorization modulo p of irreducible monic polynomials in Z[X] to the
question of the splitting of the ideal (p)R. Section 4 gives a number of examples of the theory for 
number fields of degree 3. 

Section 5 establishes the Dirichlet Unit Theorem, which describes the group of units in the ring 
of algebraic integers in a number field. The torsion subgroup is the subgroup of roots of unity, and 
it is finite. The quotient of the group of units by the torsion subgroup is a free abelian group of a 
certain finite rank. The proof is an application of the Minkowski Lattice-Point Theorem. 

Section 6 concerns class numbers of algebraic number fields. Two nonzero ideals I and J in the
ring of algebraic integers of a number field are equivalent if there are nonzero principal ideals (a)
and (b) with (a)I = (b)J. It is relatively easy to prove that the set of equivalence classes has a group
structure and that the order of this group, which is called the class number, is finite. The class number
is 1 if and only if the ring is a principal ideal domain. One wants to be able to compute class numbers,
and this easy proof of finiteness of class numbers is not helpful toward this end. Instead, one applies 
the Minkowski Lattice-Point Theorem a second time, obtaining a second proof of finiteness, one that 
has a sharp estimate for a finite set of ideals that need to be tested for equivalence. Some examples 
are provided. A by-product of the sharp estimate is Minkowski’s theorem that the field discriminant 
of any number field other than Q is greater than 1. In combination with the Dedekind Discriminant 
Theorem, this result shows that there always exist ramified primes over Q. 


1. Setting 


It is worth stepping back from the results of Chapter I to put matters into perspective.
Chapter I studied three problems, all of which could be stated in terms of
elementary number theory. These were the questions of solvability of quadratic
congruences, of representability of integers or rational numbers by primitive
binary quadratic forms, and of the infinitude of primes in arithmetic progressions.

We had started from the more general problem of studying Diophantine equations,
beginning with the observation that solvability in integers implies solvability
modulo each prime.¹ Linear congruences being no problem, we began with
quadratic congruences and were led to quadratic reciprocity. Then we sought 
to apply quadratic reciprocity to address representability of integers or rational 
numbers by binary quadratic forms. The reasons for studying the infinitude of 
primes in arithmetic progressions were more subtle; what we saw was that at 
various stages in dealing with binary quadratic forms, this question of infinitude 
kept arising, along with techniques that might be helpful in addressing it. 

Work on at least the first two of the problems was helped to some extent by the 
use of algebraic integers, and we shall see momentarily that algebraic integers 
illuminate work on the third problem as well. In any event, it is apparent where 
to look for a natural generalization. We are to study higher-degree congruences, 
perhaps in more than one variable, and we are to use algebraic extensions of the 
rationals of degree greater than 2 to help in the study. 

The situation studied in Section IX.17 of Basic Algebra will be general enough 
for now. Thus let F(X) be a monic irreducible polynomial in Z[X]. Section IX.17
began to look at the question of how F(X) reduces modulo each prime p. We
begin by reviewing the case of degree 2, the main results in this case having been
obtained in Chapter I in the present volume. For the polynomial $F(X) = X^2 - m$
with m € Z, the assumed irreducibility means that m is not the square of an 
integer. For fixed m and most primes p, either F(X) remains irreducible modulo 
p or F(X) splits as the product of two distinct linear factors. The exceptional 
primes have the property that F(X) modulo p is the square of a linear factor; 
these are the prime divisors of m and sometimes the prime 2. In short, they occur 
among the prime divisors of the discriminant 4m of F(X). In terms of quadratic 
residues, the irreducibility of F(X) modulo p means that m is not a quadratic 
residue modulo p, and the splitting into two distinct linear factors means that it 
is. The odd primes for which F(X) modulo p is the square of a linear factor are 
the odd primes that divide m. Modulo 2, every integer is a square, and reduction 
modulo 2 was not helpful. 
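
The three reduction patterns just described can be tabulated by brute force for small primes. The following is a minimal sketch; the function name is hypothetical, and the method (counting roots in Z/pZ) is chosen for transparency rather than taken from the text. It relies on the fact that a monic quadratic over a field either has no root, has two distinct roots, or has one repeated root.

```python
def reduction_type(m, p):
    """Classify X^2 - m modulo the prime p by counting its roots in Z/pZ."""
    roots = [x for x in range(p) if (x * x - m) % p == 0]
    if len(roots) == 0:
        return "irreducible"
    if len(roots) == 2:
        return "two distinct linear factors"
    return "square of a linear factor"
```

For m = 5 the call `reduction_type(5, 3)` reports an irreducible reduction, `reduction_type(5, 11)` reports two distinct linear factors (the roots are 4 and 7), and both `reduction_type(5, 5)` and `reduction_type(5, 2)` report the square of a linear factor, in line with the remark that the exceptional primes are the prime divisors of m and sometimes the prime 2.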

The number theory of quadratic number fields sheds additional light on this
factorization. The relevant field is of course $\mathbb{Q}(\sqrt{m})$; this is a nontrivial extension
of $\mathbb{Q}$, since m is not square. In working with this field in Chapter I, we imposed
the additional condition that m be square free. Promising a general definition for
later, we defined the field discriminant of $\mathbb{Q}(\sqrt{m})$ in that chapter to be


$$D = \begin{cases} 4m & \text{if } m \equiv 2 \bmod 4 \text{ or } m \equiv 3 \bmod 4, \\ m & \text{if } m \equiv 1 \bmod 4. \end{cases}$$


¹Solvability modulo each prime power is also of interest but played a role in Chapter I only for
powers of 2.


Problems 20—24 in Chapter I implicitly related the splitting of F(X) modulo p 
to the factorization of ideals. Let R be the ring of algebraic integers in $\mathbb{Q}(\sqrt{m})$.
If p is an odd prime, those problems observed that (p)R is a prime ideal in R if 
D is a nonsquare modulo p, is the product of two distinct prime ideals if D is a 
square modulo p but is not divisible by p, and is the square of a prime ideal if D 
is divisible by p. The factorization of (2)R was more subtle and was addressed 
in Problem 21. 
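
The case distinction for odd primes can be turned into a small computation. The sketch below assumes that m is square free and different from 1, as in the text, and that p is an odd prime; it decides whether D is a square modulo p by Euler's criterion, which is a standard tool and not necessarily the one used in the problems cited. The function names are hypothetical.

```python
def field_discriminant(m):
    """Field discriminant D of Q(sqrt(m)) for square-free m != 1, per the displayed formula."""
    return m if m % 4 == 1 else 4 * m


def splitting_of_odd_prime(m, p):
    """Factorization type of (p)R for an odd prime p, read off from D as in the text."""
    D = field_discriminant(m)
    if D % p == 0:
        return "square of a prime ideal"
    # Euler's criterion: D is a square modulo p exactly when D^((p-1)/2) = 1 mod p.
    if pow(D % p, (p - 1) // 2, p) == 1:
        return "product of two distinct prime ideals"
    return "prime ideal"
```

For m = 5 the discriminant is 5, and the primes 3, 11, and 5 give the three different answers. For m = −1 the discriminant is −4, and the prime 5 gives a product of two distinct prime ideals while the prime 3 stays prime. Python's `%` returns a nonnegative remainder for a positive modulus, so negative m is handled by the same test.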

In any event, the pattern of reducibility modulo p of $X^2 - m$, at least when
the prime p is odd, mirrors the pattern of factorization of the ideal generated
by p in the ring of algebraic integers in the number field $\mathbb{Q}(\sqrt{m})$. The role
of quadratic reciprocity was to explain this pattern. Problem 1 at the end of
Chapter I showed that one qualitative consequence of quadratic reciprocity is that
the odd primes p for which $X^2 - m$ remains irreducible are the ones in certain
arithmetic progressions, and similarly for the odd primes not dividing m for which
a factorization into two linear factors occurs.
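
The statement about arithmetic progressions can be observed numerically. The sketch below is an illustration under assumptions: it takes m = 3 and the modulus 12 as a sample case, uses Euler's criterion to test whether m is a square modulo p, and uses trial division as a primality test; the function names and the search bound are hypothetical choices.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def residue_pattern(m, modulus, bound):
    """For each class r mod `modulus`, collect the answers to 'is m a square mod p?'
    over the odd primes p < bound with p = r mod modulus and p not dividing m."""
    pattern = {}
    for p in range(3, bound, 2):
        if is_prime(p) and m % p != 0:
            is_square = pow(m % p, (p - 1) // 2, p) == 1
            pattern.setdefault(p % modulus, set()).add(is_square)
    return pattern
```

Calling `residue_pattern(3, 12, 500)` yields a single answer in each class: 3 is a square modulo p for the primes in the classes 1 and 11 modulo 12 and a nonsquare for the primes in the classes 5 and 7. A finite search of this kind only exhibits the pattern; the proof is the content of quadratic reciprocity.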

One objective of a generalization is to produce a corresponding theory for an 
arbitrary monic irreducible polynomial F(X) in Z[X], say of degree n. Let K be 
the extension of Q generated by a root of F(X), and let R be the ring of algebraic 
integers in K. Theorem 9.60 of Basic Algebra shows for each prime number p 
that the decomposition of the ideal (p)R in R as a product of powers of distinct 
prime ideals takes the form $(p)R = \prod_{i=1}^{g} P_i^{e_i}$ with $f_i = [R/P_i : \mathbb{Z}/(p)]$ and
$\sum_{i=1}^{g} e_i f_i = n$. Meanwhile, F(X) factors modulo p as a product of powers of
irreducible polynomials modulo p. Sections 2—3 will describe a theory begun 
by Kummer and Dedekind for how the factorization of the ideal (p)R and the 
factorization of the polynomial F(X) modulo p are related. One introduces a field 
discriminant for K that is closely related to the discriminant of the polynomial 
F(X), and a key result, the Dedekind Discriminant Theorem, says that some $e_i$
is > 1 if and only if p divides the field discriminant. The primes p for which
some $e_i$ is greater than 1 are said to ramify in the extension field K. These primes
are not as well behaved as the others, and one’s first inclination might be to try 
to ignore them. However, Problems 25—40 at the end of Chapter I show that the 
ramified primes encode a great deal of information; in particular, they explain the 
theory of genera and the relationship between exact representability of rational 
numbers and representability of integers modulo the field discriminant. 

Generalizations of quadratic reciprocity lie much deeper and are central results 
of the subject of class field theory, a subject that is beyond the scope of the present 
book. Suffice it to say that class field theory in its established form seeks to
parametrize all finite Galois extensions of any number field having abelian Galois
group; the parametrization is to refer only to data within the given number field. 
The reciprocity theorem in this setting goes under the name “Artin reciprocity,” 
which includes quadratic reciprocity as a very special case. Class field theory 
for nonabelian finite Galois extensions is at present largely conjectural, and the 
conjectural reciprocity statement goes under the name “Langlands reciprocity.” 

Beginning in Section I.6, we translated some of the theory of binary quadratic 
forms into facts about quadratic number fields. One tool we needed was a de- 
scription of the units in the ring of algebraic integers within the quadratic number 
field. It is to be expected that a similar description for an arbitrary number field 
will play a foundational role in number theory beyond the quadratic case. The 
description in question is captured in the Dirichlet Unit Theorem, which appears 
as Theorem 5.13 in Section 5. 

The translation of the notion of proper equivalence class of binary quadratic 
forms into the language of quadratic field extensions led to a notion of strict 
equivalence of ideals, as well as a notion of ordinary equivalence. Because there 
are only finitely many proper equivalence classes of forms, there could be only 
finitely many strict equivalence classes of ideals, and this set of classes of ideals 
acquired the structure of a finite abelian group. Dirichlet studied the order of this 
group, which figures into formulas for the value of certain Dirichlet L functions 
L(s, x) at s = 1. The ideal class group for ordinary equivalence is a quotient of 
this group by a subgroup of order at most 2. 

Although we shall not be concerned with representability of integers by forms 
of degree greater than 2, the ideal class group and its order (the “class number” 
of the field) are of interest for general number fields when defined in terms of 
ordinary equivalence, not strict equivalence. Section 6 is devoted to proving that 
the class number is finite for any number field and to developing some tools 
for computing class numbers. Class number 1 is equivalent to having the ring 
of algebraic integers in question be a principal ideal domain. Apart from the 
appearance of class numbers in various limit formulas, here is one other indicator 
of the importance of the ideal class group: It is possible to extend the above theory 
of ramification in such a way that it applies to any extension K/F of number fields, 
not just to finite extensions of Q. Hilbert proved that for any F, there is a finite 
Galois extension K/F with abelian Galois group that is small enough for the 
extension to be unramified at every prime ideal of F and that is large enough for 
any unramified abelian extension of F to lie in K. Artin reciprocity can be used 
to show that Gal(K/F) is isomorphic to the ideal class group’ of F and thus gives 
some control over the nature of K. In particular, K = F if and only if every 
ideal in the ring of integers of F is principal. When F is quadratic over Q, the 


2The field K is called the Hilbert class field of F. The name “class field” is meant to be a 
reminder of this isomorphism. 




field K can be used to give more definitive results than in Chapter I concerning
representability of integers by binary quadratic forms. 


2. Discriminant 


Let us recall some material about Dedekind domains from Chapters VIII and IX 
of Basic Algebra. A Dedekind domain is a Noetherian integral domain that is 
integrally closed and has the property that every nonzero prime ideal is maximal. 
Any principal ideal domain is an example. Any Dedekind domain has unique 
factorization for its ideals. Theorem 8.54 of the book gave a construction for 
extending certain Dedekind domains to larger Dedekind domains: if D is a 
Dedekind domain with field of fractions F and if K is a finite separable extension 
of F, then the integral closure of D in K is a Dedekind domain R. The hard 
step in the proof, which was not carried out until Section IX.15, was to deduce
from the separability that R is finitely generated over D. The role of separability
was to force the bilinear form $(a, b) \mapsto \mathrm{Tr}_{K/F}(ab)$ to be nondegenerate, and this
nondegeneracy in turn implied the desired result about finite generation. 

In this section we introduce a tool that captures this last implication in quan- 
titative fashion—that nondegeneracy of the trace form implies that the extended 
domain is finitely generated over the given domain. In a full-fledged treatment of 
algebraic number theory, one might well want to work in this full generality,³ but
we need less for our purposes: Throughout this section we assume that the given 
Dedekind domain is the ring Z of integers, that K is a number field, and that R is 
the integral closure of Z in K, i.e., R is the ring of algebraic integers within K. 
Let n = [K : Q] be the degree of the field extension. Since C is algebraically
closed, we can regard K as a subfield of C. 

The separability of K/Q in combination with the fact that C is algebraically
closed implies that there exist exactly n distinct field maps $\sigma_1, \dots, \sigma_n$ of K into
C; one of them is the identity. Recall how $\sigma_1, \dots, \sigma_n$ can be constructed: if $\xi$ is a
primitive element for K/Q, if F(X) is the minimal polynomial of $\xi$ over Q, and
if $\xi_1 = \xi, \xi_2, \dots, \xi_n$ are the n distinct roots of F(X) in C, then $\sigma_j$ can be defined
by $\sigma_j\bigl(\sum_{i=0}^{n-1} c_i \xi^i\bigr) = \sum_{i=0}^{n-1} c_i \xi_j^i$ on any Q linear combination of powers of $\xi$. For
any $\eta = \sum_{i=0}^{n-1} c_i \xi^i$ in K, primitive or not, the n elements $\sigma_j(\eta)$ of C are called
the conjugates of $\eta$ relative to K. They are the roots of the field polynomial of $\eta$
over Q, and each occurs with multiplicity $[K : Q(\eta)]$.⁴


³ For example, this full level of generality would be appropriate if one planned ultimately to study
class field theory. 

⁴ The field polynomial of an element of K is the characteristic polynomial of left multiplication
on K by the element. This notion is discussed in Section IX.15 of Basic Algebra. 




Let $\Gamma = (v_1, \dots, v_n)$ be an ordered basis of K over Q. The symmetric
bilinear form $(u, v) \mapsto \mathrm{Tr}_{K/Q}(uv)$ determines an n-by-n symmetric matrix $B_{ij} =
\mathrm{Tr}_{K/Q}(v_i v_j)$, and we can recover the form from the matrix B by the formula
$\mathrm{Tr}_{K/Q}(uv) = a^t B b$ if $a = (a_1, \dots, a_n)^t$ and $b = (b_1, \dots, b_n)^t$ are the column vectors of u and v in
the ordered basis $\Gamma$, i.e., if $u = \sum_{i=1}^{n} a_i v_i$ and $v = \sum_{j=1}^{n} b_j v_j$. From Section VI.1
of Basic Algebra, we know that the bilinear form determines a canonical Q linear
map L from K to its vector space dual by the formula $L(u)(v) = \mathrm{Tr}_{K/Q}(uv)$ and
that the nondegeneracy of the form⁵ implies that this linear map is one-one onto.
Moreover, the matrix of L with respect to $\Gamma$ and the dual basis of $\Gamma$ is B. Thus
the nondegeneracy implies that the matrix B is nonsingular. The discriminant
$D(\Gamma)$ of the ordered basis $\Gamma$ is given by

\[ D(\Gamma) = \det B, \quad \text{where $B$ is the matrix of $(u, v) \mapsto \mathrm{Tr}_{K/Q}(uv)$ in the basis $\Gamma$.} \]


Because of the nonsingularity of B, this is a nonzero member of Q. 

Proposition 6.1 of Basic Algebra shows the effect on the matrix B of changing
the basis. Specifically let $\Delta = (w_1, \dots, w_n)$ be a second ordered basis, and let C
be the matrix of the form in this basis, namely $C_{ij} = \mathrm{Tr}_{K/Q}(w_i w_j)$. Let the two
bases be related by $w_j = \sum_{i=1}^{n} a_{ij} v_i$, i.e., let $[a_{ij}] = \binom{I}{\Gamma\Delta}$. Then the proposition
gives

\[ C = \binom{I}{\Gamma\Delta}^{t} B \binom{I}{\Gamma\Delta}. \]

Taking determinants and using the fact that a matrix and its transpose have the
same determinant, we obtain

\[ D(\Delta) = D(\Gamma) \Bigl(\det \binom{I}{\Gamma\Delta}\Bigr)^{2}. \]

One consequence of this formula is that the sign of $D(\Gamma)$ is independent of $\Gamma$.
Another is that the value of $D(\Gamma)$ does not depend on the ordering of the n
members of $\Gamma$; it depends only on $\Gamma$ as an unordered set.

Now suppose that the members of the ordered basis $\Gamma$ are in the subring R
of algebraic integers within K. Bases of K over Q consisting of members of R
always exist, since we can always multiply the members of a basis of K over Q by
a suitable integer to get them to be in R. In this case the entries $B_{ij} = \mathrm{Tr}_{K/Q}(v_i v_j)$
of the matrix of the bilinear form are in Z, and $D(\Gamma)$ is therefore a nonzero member
of Z.

The field discriminant, or absolute discriminant, of K, denoted by $D_K$, is
the value of $D(\Gamma)$ that minimizes $|D(\Gamma)|$ for all bases of K consisting of members


⁵ The nondegeneracy of the trace form for a number field is a transparent result, not requiring
anything deep from Section IX.15 of Basic Algebra, since any $u \ne 0$ in K has $\mathrm{Tr}_{K/Q}(u u^{-1}) =
\mathrm{Tr}_{K/Q}(1) = n \ne 0$.




of R. This is a nonzero integer. The sign of $D_K$ is well defined, since all values
of $D(\Gamma)$ have the same sign.⁶


Fix an ordered basis $\Gamma = (v_1, \dots, v_n)$ of K, and consider the abelian group
consisting of the Z span $Z(\Gamma)$ of the members of $\Gamma$. This is evidently a free
abelian group of rank n. If an ordered basis $\Delta = (w_1, \dots, w_n)$ has the property
that $Z(\Delta) \subseteq Z(\Gamma)$, then the theory in Section IV.9 of Basic Algebra that leads
to the Fundamental Theorem of Finitely Generated Abelian Groups shows that if
we write formally

\[ \begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix} = C \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}, \]

then there exist n-by-n integer matrices $M_1$ and $M_2$ of determinant $\pm 1$ such that
$D = M_1 C M_2$ is diagonal, and moreover the order of $Z(\Gamma)/Z(\Delta)$ is $|\det D| =
|\det C|$. Examining the definition of C, we see that $C = \binom{I}{\Gamma\Delta}^{t}$. Consequently
we obtain

\[ |Z(\Gamma)/Z(\Delta)| = \Bigl|\det \binom{I}{\Gamma\Delta}\Bigr|, \]

a formula we shall use repeatedly in this chapter without specific reference.

Proposition 5.1. If $\Gamma$ is a basis of K over Q whose members all lie in R,
then $|R/Z(\Gamma)|^2 = D(\Gamma)/D_K$. In particular, $\Gamma$ is a Z basis of R if and only if
$D(\Gamma) = D_K$.


REMARKS. We already know from Basic Algebra that R is a free abelian 
group of rank n. The second conclusion of this proposition, in combination with 
the transparent observation that the trace form is nonsingular for a number field, 
gives a more direct proof of this fact. Introductory treatments of algebraic number 
theory sometimes give this more direct proof, whose details are spelled out in the 
second paragraph below. 


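PROOF. Let $\Delta$ and $\Omega$ be two bases of K over Q whose members all lie in R,
and suppose that $Z(\Delta) \subseteq Z(\Omega)$. Then the above discussion shows that

\[ |D(\Delta)| = |D(\Omega)| \Bigl(\det \binom{I}{\Omega\Delta}\Bigr)^{2} \]

and that

\[ |Z(\Omega)/Z(\Delta)|^{2} = \Bigl(\det \binom{I}{\Omega\Delta}\Bigr)^{2}. \]

Since $D(\Delta)$ and $D(\Omega)$ are nonzero and have the same sign, we obtain

\[ D(\Delta)/D(\Omega) = |Z(\Omega)/Z(\Delta)|^{2}. \qquad (*) \]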


⁶ As was observed above, any $D(\Delta)$ is the product of $D(\Gamma)$ and the square of a rational number.
Hence $D(\Delta)$ and $D(\Gamma)$ have the same sign.




To prove the proposition, we prove the “if” part of the second conclusion
first, without using the known fact that R is free abelian. Choose $\Delta$ such that
$D(\Delta) = D_K$ and such that $\Delta$ has all its members in R. Arguing by contradiction,
suppose that $\Delta$ fails to be a Z basis of R. Let r be an element of R not in $Z(\Delta)$.
Then the Z span of $Z(\Delta) \cup \{r\}$ is a finitely generated additive subgroup of K and
must be free abelian of rank $\ge n$. Being a subgroup of the additive group of K,
it cannot have rank greater than n and hence has rank exactly n. Let $\Omega$ be an
ordered Z basis of this subgroup. Since $Z(\Delta) \subsetneq Z(\Omega)$, the right side of (*) is
> 1, and thus $|D_K| > |D(\Omega)|$. But this is a contradiction because the members of
$\Omega$ lie in R, and hence $\Delta$ is a Z basis of R. In particular, a Z basis of R exists.

To prove the rest of the proposition, take $\Omega$ in (*) to be a Z basis of R,
and let $\Delta = \Gamma$ be any given basis of K over Q that lies in R. Then (*) gives
$|R/Z(\Gamma)|^2 = D(\Gamma)/D(\Omega)$. Since $|R/Z(\Gamma)|$ cannot be less than 1, $|D(\Gamma)|$ cannot
be less than $|D(\Omega)|$. Thus $D_K = D(\Omega)$, and $|R/Z(\Gamma)|^2 = D(\Gamma)/D_K$. This
proves the first conclusion of the proposition, and the “only if” part of the second
conclusion is immediate.


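EXAMPLE. Field discriminant of a quadratic number field. Let $K = Q(\sqrt{m})$,
where m is a square-free integer other than 1. From Section I.6 a Z ordered basis
$\Gamma$ of R is given by

\[ \Gamma = \begin{cases} \{1, \sqrt{m}\} & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\ \{1, \tfrac12(\sqrt{m} - 1)\} & \text{if } m \equiv 1 \bmod 4. \end{cases} \]

Proposition 5.1 allows us to compute $D_K$ from this information. The matrix whose
determinant is $D_K$ in the two cases is
$\begin{pmatrix} 2 & 0 \\ 0 & 2m \end{pmatrix}$ and $\begin{pmatrix} 2 & -1 \\ -1 & \tfrac12(m+1) \end{pmatrix}$, respectively, and
thus

\[ D_K = \begin{cases} 4m & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\ m & \text{if } m \equiv 1 \bmod 4. \end{cases} \]

This is the formula that we took as a definition of field discriminant in Section
I.6.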
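The two trace matrices in this example can be checked mechanically. The following is a minimal sketch, not part of the original text: it models an element $a + b\sqrt{m}$ as a pair of rationals, and the helper names `mul`, `trace`, `disc`, and `integral_basis` are hypothetical. It also prints the ratio $D(\Gamma)/D_K$ for the basis $\Gamma = (1, \sqrt{m})$, which by Proposition 5.1 should be the square of the index $|R/Z(\Gamma)|$.

```python
from fractions import Fraction

def mul(x, y, m):
    """Multiply x = (a, b) and y = (c, d), read as a + b*sqrt(m) and c + d*sqrt(m)."""
    a, b = x
    c, d = y
    return (a * c + m * b * d, a * d + b * c)

def trace(x):
    """Trace from Q(sqrt(m)) to Q of a + b*sqrt(m), namely 2a."""
    return 2 * x[0]

def disc(basis, m):
    """Determinant of the 2-by-2 matrix [Tr(v_i v_j)] for a two-element basis."""
    g = [[trace(mul(v, w, m)) for w in basis] for v in basis]
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

def integral_basis(m):
    """The Z basis of R quoted in the example (m square free, m != 1)."""
    one = (Fraction(1), Fraction(0))
    if m % 4 == 1:
        return [one, (Fraction(-1, 2), Fraction(1, 2))]   # (sqrt(m) - 1)/2
    return [one, (Fraction(0), Fraction(1))]              # sqrt(m)

power_basis = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]
for m in (-1, 2, 3, 5, -3, 13):
    d_k = disc(integral_basis(m), m)
    d_gamma = disc(power_basis, m)
    print(m, d_k, d_gamma, d_gamma / d_k)   # last column is |R/Z(Gamma)|^2
```

For m = 5 the sketch reports $D_K = 5$ and $D(\Gamma) = 20$, so that $(1, \sqrt{5})$ spans a subgroup of index 2 in R.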


For a general number field K of degree n over Q, there is no easy way to obtain
a Z basis of R. Instead, one tries to compute $D_K$ and find such a basis at the same
time by successive refinements.

The first step is to use the special kind of Q basis of K whose existence is
guaranteed by the Theorem of the Primitive Element. Specifically one can write
$K = Q(\xi)$ for some $\xi$ in K, since K/Q is a separable extension. Possibly after
multiplying $\xi$ by a suitably large integer, we may assume that $\xi$ is in R. Then
$\Gamma(\xi) = \{1, \xi, \xi^2, \dots, \xi^{n-1}\}$ is a Q basis of K lying in R. We normally write
$D(\xi)$ instead of $D(\Gamma(\xi))$ for the discriminant of $\Gamma(\xi)$. Write $\xi_i = \sigma_i(\xi)$ for the
i-th conjugate of $\xi$. Let $B = [B_{ij}]$ be the matrix whose determinant is $D(\xi)$. Since
the trace of an element is the sum of its conjugates, $B_{ij}$ is given by

\[ B_{ij} = \mathrm{Tr}_{K/Q}(\xi^{i-1}\xi^{j-1}) = \sum_{k=1}^{n} \sigma_k(\xi^{i-1}\xi^{j-1}) = \sum_{k=1}^{n} \xi_k^{i-1}\xi_k^{j-1}, \]

and this is of the form $\sum_{k=1}^{n} V_{ik} V_{jk}$, where $V_{ik} = \xi_k^{i-1}$ is an entry of a Vandermonde
matrix. Therefore

\[ D(\xi) = \det B = (\det V)^2 = \Bigl(\prod_{i<j} (\xi_j - \xi_i)\Bigr)^{2} = \prod_{i<j} (\xi_i - \xi_j)^2, \]

which coincides with the discriminant of the field polynomial of $\xi$ over Q.


EXAMPLES OF $D(\xi)$.

(1) $K = Q(\xi)$, where $\xi^5 - \xi - 1 = 0$. This field was studied in Example 1 of
Section IX.17 of Basic Algebra. The discriminant of the polynomial $X^5 - X - 1$
is $2869 = 19 \cdot 151$, and thus $D(\xi) = 2869$. Proposition 5.1 shows that $D(\xi) =
D_K k^2$ for some nonzero integer k. Since 2869 is square free, we conclude that
$D_K = 2869$.

(2) $K = Q(\sqrt[3]{2})$. The minimal polynomial of $\xi = \sqrt[3]{2}$ is $X^3 - 2$, and its roots
are $\xi$, $\xi\omega$, and $\xi\omega^2$, where $\omega = e^{2\pi i/3}$. Then

\[ D(\xi) = (\xi - \xi\omega)^2 (\xi - \xi\omega^2)^2 (\xi\omega - \xi\omega^2)^2 = \xi^6 (1 - \omega)^2 (1 - \omega^2)^2 (\omega - \omega^2)^2, \]

and this simplifies to $D(\xi) = -2^2 3^3$. This quantity is the product of $D_K$ by the
square of an integer. Thus $D_K$ is one of $-3$, $-12$, $-27$, and $-108$.
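Both values of $D(\xi)$ can be recomputed from the coefficients of the minimal polynomial alone, since the entry $B_{ij} = \sum_k \xi_k^{i+j-2}$ is a power sum of the roots and Newton's identities express power sums through the coefficients. The sketch below is an illustration under that approach, not the method of the text; the function names are hypothetical, and the polynomial $X^n + a_1 X^{n-1} + \cdots + a_n$ is passed as the list `[a_1, ..., a_n]`.

```python
from fractions import Fraction

def power_sums(a, count):
    """p_0, ..., p_{count-1} for the roots of X^n + a[0] X^(n-1) + ... + a[n-1]."""
    n = len(a)
    p = [n]
    for k in range(1, count):
        # Newton: p_k + a_1 p_{k-1} + ... + a_{k-1} p_1 + k a_k = 0 (a_k = 0 for k > n)
        s = -sum(a[i - 1] * p[k - i] for i in range(1, min(k - 1, n) + 1))
        if k <= n:
            s -= k * a[k - 1]
        p.append(s)
    return p

def det(matrix):
    """Determinant by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, d = len(m), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return d

def disc_of_power_basis(a):
    """D(xi) = det [Tr(xi^(i+j))], 0 <= i, j < n, for a root xi of the given polynomial."""
    n = len(a)
    p = power_sums(a, 2 * n - 1)
    return det([[p[i + j] for j in range(n)] for i in range(n)])

print(disc_of_power_basis([0, 0, 0, -1, -1]))   # X^5 - X - 1
print(disc_of_power_basis([0, 0, -2]))          # X^3 - 2
```

The two printed values agree with the discriminants 2869 and $-108$ found in Examples 1 and 2.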


What happens with Example 2 is typical: a second step is needed to decide
among finitely many possibilities for $D_K$. In the general case an induction is
involved, and Proposition 5.2 below says what is to be done at each step. At the
end of this section, we shall return to Example 2 and use the proposition to see
that $D_K = -108$ is the correct choice.

Before stating Proposition 5.2, let us interpolate a generalization of the computation
of $D(\xi)$ that preceded the above examples. Suppose that $\Gamma = (\alpha_1, \dots, \alpha_n)$
is any ordered Q basis of K lying in R. Let $B = [B_{ij}]$ be the matrix whose
determinant is the discriminant of $\Gamma$. Then we have

\[ B_{ij} = \mathrm{Tr}_{K/Q}(\alpha_i \alpha_j) = \sum_{k=1}^{n} \sigma_k(\alpha_i \alpha_j) = \sum_{k=1}^{n} \sigma_k(\alpha_i)\sigma_k(\alpha_j) = \sum_{k=1}^{n} A_{ik} (A^t)_{kj}, \]

where $A = [A_{ij}]$ is the matrix with $A_{ij} = \sigma_j(\alpha_i)$, and it follows that

\[ D(\Gamma) = (\det[\sigma_j(\alpha_i)])^{2}. \]

This formula can be useful for computing $D(\Gamma)$ when the conjugates of the $\alpha_i$
are readily available.
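As a numerical illustration of this formula, one can take $K = Q(\sqrt[3]{2})$ and the basis $(1, \theta, \theta^2)$ with $\theta = \sqrt[3]{2}$, whose conjugates are available in closed form. The sketch below uses floating-point complex arithmetic, so the result is only approximate; the variable names are chosen here for illustration.

```python
import cmath

theta = 2 ** (1 / 3)
omega = cmath.exp(2j * cmath.pi / 3)
conjugates = [theta, theta * omega, theta * omega ** 2]   # sigma_j(theta), j = 1, 2, 3

# Row i holds the conjugates sigma_j(alpha_i) of alpha_i = theta^i, i = 0, 1, 2.
a = [[z ** i for z in conjugates] for i in range(3)]

def det3(m):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

d = det3(a) ** 2
print(round(d.real), abs(d.imag) < 1e-9)
```

Up to rounding error the square of the determinant is $-108$, in agreement with Example 2.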




Proposition 5.2. Let $\Gamma = (v_1, \dots, v_n)$ be an ordered Q basis of K lying in
R. If the Z span $Z(\Gamma)$ of $\Gamma$ is a proper subgroup of R, then there exists a prime
number p such that $p^2$ divides $D(\Gamma)$ and such that some member

\[ v_k' = p^{-1}(c_1 v_1 + c_2 v_2 + \cdots + c_{k-1} v_{k-1} + v_k) \]

of K lies in R with $1 \le k \le n$ and $0 \le c_j \le p - 1$ for $j \le k - 1$. If such
an element $v_k'$ is found, then $\Delta = (v_1, \dots, v_{k-1}, v_k', v_{k+1}, \dots, v_n)$ has $Z(\Delta)$
properly containing $Z(\Gamma)$ with $D(\Delta) = p^{-2} D(\Gamma)$.


REMARKS. A finite computation is involved in finding p and k. On the one
hand, for given p, at most $1 + p + p^2 + \cdots + p^{n-1}$ elements have to be checked for
integrality. On the other hand, we in principle have to find the field polynomial
of a certain element of K in each case and decide whether the coefficients are
integers, and this computation may be lengthy. See Problem 2 at the end of the
chapter for an easy example, Problem 16 for a harder example, and Problem 4b
for a related computation.


PROOF. Let $Z(\Gamma)$ be a proper subgroup of R, and put $m = |R/Z(\Gamma)|$. Choose
a Z basis $(w_1, \dots, w_n)$ of R, and write $v_i = \sum_{j=1}^{n} c_{ij} w_j$ with all $c_{ij} \in Z$. We
know that $|\det[c_{ij}]| = m$, and we let p be any prime divisor of m. Reducing the
$c_{ij}$ modulo p, we see that the matrix $[c_{ij}]$ is singular modulo p, and thus there
exist integers $a_1, \dots, a_n$ not all divisible by p such that

\[ \sum_{i=1}^{n} a_i c_{ij} \equiv 0 \bmod p \qquad \text{for } 1 \le j \le n. \]

Find k with $1 \le k \le n$ for which p divides all of $a_{k+1}, \dots, a_n$ but not $a_k$, and
write $\sum_{i=1}^{n} a_i c_{ij} = p l_j$ for integers $l_j$. Then

\[ \sum_{i=1}^{k} a_i v_i = \sum_{j=1}^{n} \sum_{i=1}^{k} a_i c_{ij} w_j = \sum_{j=1}^{n} \Bigl(p l_j - \sum_{i=k+1}^{n} a_i c_{ij}\Bigr) w_j, \]

and the integer in parentheses on the right side is a multiple of p. Therefore
$r = \sum_{i=1}^{k} a_i v_i$ is exhibited as ps for some $s \in R$. Choose $a'$ and $d_k$ in Z with
$a' a_k - p d_k = 1$, and choose $c_i$ and $d_i$ in Z for each i with $i \le k - 1$ such that
$0 \le c_i \le p - 1$ and $a' a_i - p d_i = c_i$. Then the computation

\[ p a' s = a' r = \sum_{i=1}^{k} a' a_i v_i = \sum_{i=1}^{k-1} (c_i + p d_i) v_i + (1 + p d_k) v_k = \sum_{i=1}^{k-1} c_i v_i + v_k + p \sum_{i=1}^{k} d_i v_i \]

shows that $p^{-1}\bigl(\sum_{i=1}^{k-1} c_i v_i + v_k\bigr) = a' s - \sum_{i=1}^{k} d_i v_i$ lies in R.




Proposition 5.1 shows that any primitive element $\xi$ of K that lies in R has
the property that $D(\xi)/D_K$ is the square of a nonzero integer, and we write this
quotient as $J(\xi)^2$ with $J(\xi) > 0$. One might hope that although some particular
choice of $\xi$ fails to have $J(\xi) = 1$, some other choice may be found for which
equality holds. We shall see in Section 4 that for a class of integers m, $Q(\sqrt[3]{m})$
has such an element $\xi$ if and only if a certain nontrivial Diophantine equation in
two variables has a solution. Both cases arise: for m = 2, such a $\xi$ exists, while
for m = 175, no such $\xi$ exists.

But matters can be worse than this for a general K. The integer $J(\xi) =
(D(\xi)/D_K)^{1/2}$ for a primitive element $\xi$ of K lying in R is sometimes called the
index of $\xi$. One might hope at least that each prime not dividing $D_K$ fails to
divide the index $J(\xi)$ for some $\xi$. However, Dedekind showed that there exist
number fields K and primes p that are common index divisors⁷ in the sense that
p divides $J(\xi)$ for every primitive element $\xi$ of K lying in R. Specifically he
showed that p = 2 is such a prime when K is obtained by adjoining to Q a root
of $X^3 + X^2 - 2X + 8$; here $D_K = -503$. We shall study this example further in
Section 4.
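For the generating root $\xi$ of Dedekind's cubic itself, the index can be read off from the polynomial discriminant, which equals $D(\xi)$. The sketch below uses the standard discriminant formula for a monic cubic and takes $D_K = -503$ from the text as given; the function name `cubic_disc` is hypothetical. It checks only this one element $\xi$ and says nothing about other primitive elements.

```python
def cubic_disc(a, b, c):
    """Discriminant of X^3 + a X^2 + b X + c."""
    return a * a * b * b - 4 * b ** 3 - 4 * a ** 3 * c - 27 * c * c + 18 * a * b * c

d_xi = cubic_disc(1, -2, 8)      # D(xi) for a root xi of X^3 + X^2 - 2X + 8
d_k = -503                       # field discriminant as quoted in the text
print(d_xi, d_xi // d_k)         # the quotient is J(xi)^2
```

The quotient is 4, so $J(\xi) = 2$ for this particular $\xi$, consistent with 2 being an index divisor in this field.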

Let us now specialize our considerations from general additive subgroups of
the form $Z(\Gamma)$ to those that are ideals in R.


Proposition 5.3. If I is a nonzero ideal in R, then

(a) I contains a positive k in Z and
(b) I additively is of the form $I = Z(\Gamma)$ for some Q basis $\Gamma$ of K whose
members lie in R.

Consequently R/I is a finite ring and satisfies $|R/I|^2 = D(\Gamma)/D_K$.


PROOF. Let r be a nonzero member of I, and let P(X) be the field polynomial of
r. Then P(X) is of the form $P(X) = X^n + a_{n-1}X^{n-1} + \cdots + a_1 X + (-1)^n N_{K/Q}(r)$,
has integers for coefficients, and has r as one of its roots. Consequently the
formula

\[ (-1)^{n+1} N_{K/Q}(r) = r\,(r^{n-1} + a_{n-1} r^{n-2} + \cdots + a_1) \]

shows that the nonzero integer $N_{K/Q}(r)$ is the product of r by a member of R and
hence lies in I. This proves (a) with $k = |N_{K/Q}(r)|$.

The ideal I additively is a subgroup of R and is thus free abelian of rank at
most n. By (a), the integer $k = |N_{K/Q}(r)|$ has the property that $kR \subseteq I \subseteq R$.
Since R/kR has $k^n$ elements, R/I is finite. Therefore I has rank n as an additive
group and must be of the asserted form $Z(\Gamma)$. This proves (b). The formula
$|R/I|^2 = D(\Gamma)/D_K$ is immediate from Proposition 5.1.


⁷ Terminology varies for this notion. Such primes p are more usually called common inessential
discriminant divisors or essential discriminant divisors. The very fact that these two more usual 
names appear to contradict each other is sufficient reason to avoid using either name. 




The absolute norm N(I) of a nonzero ideal I of R is defined to be N(I) =
|R/I|. This is necessarily a positive integer by Proposition 5.3. To be able to
work with this notion, we shall make use of the unique factorization of ideals of
R as given in Theorem 8.55 of Basic Algebra. That theorem says that such an
ideal I has a factorization of the form $\prod_{j=1}^{l} P_j^{e_j}$, where the $P_j$ are distinct prime
ideals of R, and that this factorization is unique except for the order of the factors.


Proposition 5.4. The absolute norms of nonzero ideals of R have the following
properties:
(a) N(R) = 1.
(b) If $I \subseteq J$ are nonzero ideals in R, then N(J) divides N(I), and I = J if
and only if N(I) = N(J).
(c) If I and J are nonzero ideals in R, then N(IJ) = N(I)N(J).
(d) If $(\alpha)$ is a nonzero principal ideal in R, then $N((\alpha)) = |N_{K/Q}(\alpha)|$.


PROOF. Conclusion (a) is immediate, and so is most of (b). If $I \subseteq J$ and
N(I) = N(J), then the First Isomorphism Theorem for abelian groups yields
$(R/I)/(J/I) \cong R/J$, and it follows that N(I)/|J/I| = N(J). Since N(I) and
N(J) are finite, N(I) = N(J) if and only if |J/I| = 1, i.e., if and only if I = J.

For (c), we begin with the special case that I and J are powers of a nonzero
prime ideal P. Inductively it is enough to show that $N(P^k) = N(P)N(P^{k-1})$
for $k \ge 1$. Since $(R/P^k)/(P^{k-1}/P^k) \cong R/P^{k-1}$ as abelian groups, it is enough
to show that

\[ |P^{k-1}/P^k| = |R/P|. \qquad (*) \]

The ring R operates on the ideal $P^{k-1}$, carrying $P^k$ into itself, and P carries $P^{k-1}$
into $P^k$. Thus $P^{k-1}/P^k$ is a unital module for the ring R/P, which is a field
because P is maximal. Hence $P^{k-1}/P^k$ is a vector space over R/P. Corollary
8.60 of Basic Algebra shows that this vector space is 1-dimensional, and then (*)
is immediate.

For the general case in (c), Corollary 8.63 of Basic Algebra shows that if
$I = \prod_{j=1}^{l} P_j^{e_j}$ is the unique factorization of the nonzero ideal I as the product
of positive powers of distinct prime ideals $P_j$, then $R/I \cong \prod_{j=1}^{l} R/P_j^{e_j}$. Hence
$N(I) = \prod_{j=1}^{l} N(P_j^{e_j})$. Because of the special case that is already proved, $N(I) =
\prod_{j=1}^{l} N(P_j)^{e_j}$. Then (c) follows in the general case.

For (d), if $\Gamma = (u_1, \dots, u_n)$ is an ordered Z basis of R, then the tuple
$\alpha\Gamma = (\alpha u_1, \dots, \alpha u_n)$ is an ordered Z basis of $(\alpha)$, and we know that $N((\alpha)) =
|R/(\alpha)| = |Z(\Gamma)/Z(\alpha\Gamma)| = \bigl|\det \binom{I}{\Gamma\,(\alpha\Gamma)}\bigr|$. But $\binom{I}{\Gamma\,(\alpha\Gamma)}$ is just the matrix of the
Q linear map left-by-$\alpha$ in the Q basis $\Gamma$, and the determinant of this linear map
is $N_{K/Q}(\alpha)$ by definition of the norm of an element.
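Conclusion (d) can be tested by brute force in the simplest case $K = Q(i)$, where $R = Z[i]$ by the quadratic example earlier in this section (m = −1). The sketch below is an illustration only, with hypothetical helper names: it counts the cosets of $(\alpha)$ directly and compares the count with $N_{K/Q}(a + bi) = a^2 + b^2$. A Gaussian integer $x + yi$ is modeled as the pair `(x, y)`.

```python
def divisible(z, alpha):
    """True when the Gaussian integer z = (x, y) lies in the ideal (alpha) of Z[i]."""
    (x, y), (a, b) = z, alpha
    n = a * a + b * b
    # z / alpha = z * conj(alpha) / n, and z * conj(alpha) = (xa + yb) + (ya - xb) i
    return (x * a + y * b) % n == 0 and (y * a - x * b) % n == 0

def ideal_norm(alpha):
    """Count the cosets of (alpha) in Z[i] by brute force."""
    a, b = alpha
    n = a * a + b * b        # n lies in (alpha), so a box of side n meets every coset
    reps = []
    for x in range(n):
        for y in range(n):
            if not any(divisible((x - u, y - v), alpha) for u, v in reps):
                reps.append((x, y))
    return len(reps)

for alpha in [(1, 1), (2, 1), (3, 0), (2, 3)]:
    print(alpha, ideal_norm(alpha), alpha[0] ** 2 + alpha[1] ** 2)
```

In each line the coset count and the norm of the generator coincide, as (d) predicts.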




EXAMPLE 2 OF $D(\xi)$, CONTINUED. For $K = Q(\sqrt[3]{2})$, we have seen that
the discriminant of the Q basis $\Gamma(\sqrt[3]{2})$ is $D(\sqrt[3]{2}) = -3^3 2^2$. We are going
to show that $(1, \sqrt[3]{2}, \sqrt[3]{4})$ is a Z basis of R, and then it follows that the field
discriminant of K is $D_K = -3^3 2^2$. We apply Proposition 5.2. The only primes
that need testing in that proposition are the ones dividing $D(\sqrt[3]{2})$, and thus
we consider p = 2 and p = 3. We want to see that no expression $p^{-1}(1)$
or $p^{-1}(c_1 + \sqrt[3]{2})$ or $p^{-1}(c_1 + c_2\sqrt[3]{2} + \sqrt[3]{4})$ is an algebraic integer for any
coefficients $c_1$ and $c_2$ between 0 and p − 1. We can discard $p^{-1}(1)$ because the
only rational numbers that are algebraic integers are the members of Z. If the
field polynomial over Q of some $\xi$ in K is $X^3 + a_2 X^2 + a_1 X + a_0$, then the
field polynomial of $p^{-1}\xi$ is $X^3 + p^{-1}a_2 X^2 + p^{-2}a_1 X + p^{-3}a_0$. So the question
of integrality is one of divisibility of the coefficients of the field polynomials of
certain algebraic integers $\xi$ by suitable powers of p. These coefficients, up to sign,
are the values of the elementary symmetric polynomials on the three conjugates
of $\xi$.

In the case at hand, only the coefficient $a_0$ is needed. That is, it is enough to
see that the norm of $\xi$ is never divisible by 8 or 27 for $\xi$ equal to $c_1 + \sqrt[3]{2}$ or
$c_1 + c_2\sqrt[3]{2} + \sqrt[3]{4}$ as above. Let us write $\xi = c_1 + c_2\theta + c_3\theta^2$ with $\theta = \sqrt[3]{2}$ and
with $c_1, c_2, c_3$ in Z. Then $a_0 = -N_{K/Q}(\xi)$, and the norm is the product of the
three conjugates of $\xi$. If $\omega = e^{2\pi i/3}$, we compute that

\[
\begin{aligned}
N_{K/Q}(\xi) &= (c_1 + c_2\theta + c_3\theta^2)(c_1 + c_2\theta\omega + c_3\theta^2\omega^2)(c_1 + c_2\theta\omega^2 + c_3\theta^2\omega^4) \\
&= (c_1^3 + 2c_2^3 + 4c_3^3) + 2c_1c_2c_3(2\omega + 3\omega^2 + \omega^4) \\
&= (c_1^3 + 2c_2^3 + 4c_3^3) - 6c_1c_2c_3.
\end{aligned}
\]

For p = 2, we consider this expression when $c_1, c_2, c_3$ are chosen from {0, 1}.
To get divisibility by 8, we check this expression modulo 8. Each $c_j^3$ is $c_j$ for
$c_j \in \{0, 1\}$. Looking at the expression modulo 2, we see that $c_1$ must be even,
i.e., $c_1 = 0$. Then 8 must divide $2c_2 + 4c_3$, and we obtain $c_2 = c_3 = 0$, in
contradiction to the formulas for the $\xi$'s under consideration.

For p = 3, it is enough to consider this expression when $c_1, c_2, c_3$ are chosen
from {−1, 0, +1}. Since each $c_j$ has $|c_j| \le 1$, we see that $|N_{K/Q}(\xi)| \le 13$,
and divisibility by 27 can occur only if $N_{K/Q}(\xi) = 0$, which we know entails
$\xi = 0$. Thus no $\xi$ meets the test of Proposition 5.2, and the conclusion is that

$(1, \sqrt[3]{2}, \sqrt[3]{4})$ is a Z basis of R in $Q(\sqrt[3]{2})$.
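The finite check in this example is small enough to enumerate directly. The sketch below is an illustration with hypothetical function names: it lists the numerators $c_1 + \theta$ and $c_1 + c_2\theta + \theta^2$ with coefficients between 0 and p − 1 and tests whether the norm $c_1^3 + 2c_2^3 + 4c_3^3 - 6c_1c_2c_3$ is divisible by $p^3$, which is necessary for $p^{-1}\xi$ to be an algebraic integer. The case $p^{-1}(1)$ is omitted, as in the text.

```python
from itertools import product

def norm(c1, c2, c3):
    """Norm of c1 + c2*theta + c3*theta^2 in Q(theta), where theta^3 = 2."""
    return c1 ** 3 + 2 * c2 ** 3 + 4 * c3 ** 3 - 6 * c1 * c2 * c3

def candidates(p):
    """Coefficient triples of the numerators in Proposition 5.2 for k = 2 and k = 3."""
    for c1 in range(p):
        yield (c1, 1, 0)                      # c1 + theta
    for c1, c2 in product(range(p), repeat=2):
        yield (c1, c2, 1)                     # c1 + c2*theta + theta^2

for p in (2, 3):
    passing = [c for c in candidates(p) if norm(*c) % p ** 3 == 0]
    print(p, passing)   # an empty list means no candidate survives the norm test
```

Both lists come out empty, matching the conclusion that no element meets the test of Proposition 5.2.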


3. Dedekind Discriminant Theorem 


The field discriminant plays a role in determining how a prime ideal (p) in Z,
p being a prime number, splits when one extends (p) to an ideal (p)R in the
ring R of algebraic integers in a number field K of degree n over Q. In this
situation, recall from Theorem 9.60 of Basic Algebra that the prime factorization
of the ideal (p)R in R is of the form $(p)R = \prod_{i=1}^{g} P_i^{e_i}$ with $\sum_{i=1}^{g} e_i f_i = n$; here
n = [K : Q], the $P_i$ are distinct, and $f_i = \dim_{F_p}(R/P_i)$. The integers $e_i$ are
called ramification indices, and the integers $f_i$ are called residue class degrees.
The extension K/Q is said to be ramified at p, and the prime p of Z is said to
ramify in K, if some $e_i$ is > 1 in this decomposition.⁸


Theorem 5.5 (Dedekind Discriminant Theorem). The prime p of Z ramifies
in a number field K if and only if p divides the field discriminant $D_K$ of K.


In this chapter we shall prove this theorem only in a useful special case, namely
in the case that p is not a common index divisor. Only finitely many primes can
divide the index $J(\xi) = (D(\xi)/D_K)^{1/2}$ for a single primitive element $\xi$ of K lying
in R, and thus there are only finitely many common index divisors.⁹ Consequently
the special case that we are proving implies that only finitely many primes of Z
ramify in K.

The difficulty in proving Theorem 5.5 in full generality is that we lack sufficient 
tools for addressing questions by localization. At the end of this section, we shall 
make some comments about how one can proceed with further tools. 

As we shall see later in this section, Theorem 5.5 for primes that are not 
common index divisors is an easy consequence of the following theorem. 


Theorem 5.6 (Kummer’s criterion). Let K be a number field, and let R be its
ring of algebraic integers. Suppose that F(X) is a monic irreducible polynomial
in Z[X], that $\xi$ is a root of F(X) in C, and that p is a prime number that does
not divide the integer $J(\xi)$ such that $J(\xi)^2 = D(\xi)/D_K$. Write $\bar F(X)$ for the
reduction of F(X) modulo p, let

\[ \bar F(X) = \bar F_1(X)^{e_1} \cdots \bar F_g(X)^{e_g} \]

be the unique factorization of $\bar F(X)$ in $F_p[X]$ into a product of powers of distinct
irreducible monic polynomials, and let $f_i = \deg(\bar F_i)$. For each i with $1 \le i \le g$,
select a monic polynomial $F_i(X)$ in Z[X] whose reduction modulo p is $\bar F_i(X)$,
and let $P_i$ be the ideal in R defined by

\[ P_i = pR + F_i(\xi)R. \]

Then the $P_i$'s are distinct prime ideals of R with $\dim_{F_p}(R/P_i) = f_i$, and the
unique factorization of (p)R into prime ideals is
\[ (p)R = P_1^{e_1} \cdots P_g^{e_g}. \]
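To see what the criterion says in practice, one can factor $\bar F(X)$ over $F_p$ for $F(X) = X^3 - 2$, where $J(\sqrt[3]{2}) = 1$ because $(1, \sqrt[3]{2}, \sqrt[3]{4})$ was shown above to be a Z basis of R; the hypothesis on p then holds for every prime. The sketch below factors by naive trial division, which is adequate only for small degree and small p. It is an illustration, not an efficient algorithm, and the function names are hypothetical. Polynomials are coefficient lists with the constant term first.

```python
from itertools import product

def divmod_poly(f, g, p):
    """Quotient and remainder of f by monic g over F_p (constant term first)."""
    f = f[:]
    q = [0] * max(len(f) - len(g) + 1, 0)
    for shift in range(len(f) - len(g), -1, -1):
        coef = f[shift + len(g) - 1] % p
        q[shift] = coef
        for i, gi in enumerate(g):
            f[shift + i] = (f[shift + i] - coef * gi) % p
    return q, [c % p for c in f[:len(g) - 1]]

def factor_mod_p(f, p):
    """Factor monic f over F_p by trial division; returns [(factor, exponent), ...]."""
    f = [c % p for c in f]
    factors = []
    deg = 1
    while len(f) - 1 >= 2 * deg:
        for tail in product(range(p), repeat=deg):
            g = list(tail) + [1]              # a monic polynomial of degree deg
            e = 0
            while len(f) > deg:
                q, r = divmod_poly(f, g, p)
                if any(r):
                    break
                f, e = q, e + 1
            if e:
                factors.append((g, e))        # irreducible: lower degrees are exhausted
        deg += 1
    if len(f) > 1:
        factors.append((f, 1))                # what remains is irreducible
    return factors

def splitting(f, p):
    """Pairs (e_i, f_i) predicted by Kummer's criterion, assuming p does not divide J(xi)."""
    return [(e, len(g) - 1) for g, e in factor_mod_p(f, p)]

for p in (2, 3, 5, 7):
    print(p, splitting([-2, 0, 0, 1], p))     # F(X) = X^3 - 2
```

For p = 5 the factorization is $(X + 2)(X^2 + 3X + 4)$, so (5)R is a product of two distinct primes with residue class degrees 1 and 2; for p = 2 and p = 3 there is a single prime with ramification index 3; and for p = 7 the polynomial stays irreducible, so (7)R is prime.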

⁸ More generally “relative discriminants,” which we have not defined, play a role in the splitting
of prime ideals in passing from a general number field to a finite extension. The cited Theorem 9.60 
applies in this more general situation as well. This more general topic will be discussed further in 
Problems 5-9 at the end of this chapter and very briefly in Chapter VI. 

⁹ In fact, it can be shown that every common index divisor is less than [K : Q].




REMARKS. The additive group $Z(\Gamma(\xi))$ generated by the powers of $\xi$ through
$\xi^{n-1}$ is a ring, since $\xi^n$ is an integral combination of the lower powers of $\xi$, and
this ring has index $J(\xi)$ as a subring of R. We divide the proof into two parts. The
first part will give a complete proof in the special case that the subring $Z(\Gamma(\xi))$ is
all of R, but we shall retain notation that distinguishes the subring from the whole
ring in order to see how much of the proof works for the general case. After the
first part we pause for a lemma that will be used to tie results for the subring to
results for all of R, and then we return to apply the lemma and complete the proof
of Theorem 5.6.


FIRST PART OF PROOF. Let $P_i'$ be the ideal $pZ[X] + F_i(X)Z[X]$ in Z[X]. The
passage from Z[X] to the quotient $Z[X]/P_i'$ can be achieved in two steps, first
using the substitution homomorphism carrying Z to $F_p$ and X to itself and then
taking the quotient by the principal ideal $(\bar F_i(X))$. Since $\bar F_i(X)$ is irreducible in
$F_p[X]$, the quotient is a field and $P_i'$ has to be prime. The number of elements in
$Z[X]/P_i'$ is $p^{f_i}$ because $\deg(\bar F_i(X)) = f_i$. The ideals $P_i'$ are distinct because the
polynomials $\bar F_i(X)$ are distinct.

Meanwhile, the substitution homomorphism of Z[X] leaving Z fixed and
carrying X to $\xi$ is a ring homomorphism of Z[X] onto $Z(\Gamma(\xi))$. Let $P_i''$ be the
image of $P_i'$ under this homomorphism, i.e., let $P_i'' = pZ(\Gamma(\xi)) + F_i(\xi)Z(\Gamma(\xi))$.
This is an ideal. The composite ring homomorphism of Z[X] onto $Z(\Gamma(\xi))/P_i''$
factors through to a ring homomorphism of $Z[X]/P_i'$ onto $Z(\Gamma(\xi))/P_i''$. Since the
domain is a field and the identity maps to the identity, the homomorphism is
one-one and the image is a field. Thus $P_i''$ is a prime ideal, the order of $Z(\Gamma(\xi))/P_i''$
is $p^{f_i}$, and $P_i'$ is the complete inverse image of $P_i''$. Since the ideals $P_i'$ can
be recovered from the $P_i''$ and since the $P_i'$ are distinct, the $P_i''$ are distinct.

The next step is to compare the ideals $\prod_{i=1}^{g} P_i^{e_i}$ and (p)R. We shall use the
fact that the polynomial $\prod_{i=1}^{g} F_i(X)^{e_i} - F(X)$ in Z[X] has coefficients divisible
by p and therefore lies in pZ[X]. The computation

\[
\begin{aligned}
\prod_{i=1}^{g} P_i^{e_i} &= \prod_{i=1}^{g} (pR + F_i(\xi)R)^{e_i} \\
&\subseteq pR + \prod_{i=1}^{g} F_i(\xi)^{e_i} R \\
&= pR + \Bigl(\prod_{i=1}^{g} F_i^{e_i} - F\Bigr)(\xi)\, R && \text{since } F(\xi) = 0 \\
&\subseteq pR + pZ(\Gamma(\xi))R && \text{since } \prod_{i=1}^{g} F_i(X)^{e_i} - F(X) \text{ lies in } pZ[X]
\end{aligned}
\]

shows that $\prod_{i=1}^{g} P_i^{e_i} \subseteq (p)R$. If we can show that $N\bigl(\prod_{i=1}^{g} P_i^{e_i}\bigr) = N((p)R)$,
then Proposition 5.4b will allow us to conclude that $\prod_{i=1}^{g} P_i^{e_i} = (p)R$.




At this point let us specialize to the case that $Z(\Gamma(\xi)) = R$ and see how to
complete the proof. Under this assumption the definitions of $P_i$ and $P_i''$ exactly
match. What we have shown about the $P_i''$ thus says that the $P_i$ are distinct prime
ideals in R with $|R/P_i| = p^{f_i}$, hence with $\dim_{F_p}(R/P_i) = f_i$. Use of Proposition
5.4 and the fact that $|Z(\Gamma(\xi))/P_i''| = p^{f_i}$ gives $N\bigl(\prod_{i=1}^{g} P_i^{e_i}\bigr) = \prod_{i=1}^{g} N(P_i)^{e_i} =
\prod_{i=1}^{g} p^{f_i e_i} = p^{\sum_i e_i f_i} = p^n$, the last equality holding because $\deg F(X) =
\sum_i e_i \deg \bar F_i(X)$. Since $p^n$ equals N((p)R), the desired equality of norms has
been proved. This completes the proof of the theorem when $Z(\Gamma(\xi)) = R$.


We interrupt the general proof for the promised lemma. When we apply
the lemma to finish the proof of Theorem 5.6, we shall take $A = Z(\Gamma(\xi))$,
$J = J(\xi)$, and m = p. The hypotheses of Theorem 5.6 show that the condition
$\mathrm{GCD}(p, J(\xi)) = 1$ is satisfied.


Lemma 5.7. Suppose that A is an additive subgroup of finite index J in R and
that m > 1 is an integer relatively prime to J. Then for each $r \in R$, there exists
$a \in A$ with r − a in mR.

PROOF. Let $\{u_1, \dots, u_n\}$ be a Z basis of R, and let $\{v_1, \dots, v_n\}$ be a Z basis of
A. We can write $v_j = \sum_{i=1}^{n} c_{ij} u_i$ for an integer matrix $[c_{ij}]$ with $|\det[c_{ij}]| = J$.
Let $r = \sum_{i=1}^{n} b_i u_i$ be given, and let the unknown $a \in A$ be expanded as $a =
\sum_{j=1}^{n} a_j v_j$. Then $a = \sum_{i,j} a_j c_{ij} u_i$, and we are to arrange that the element

\[ r - a = \sum_{i=1}^{n} \Bigl(b_i - \sum_{j=1}^{n} c_{ij} a_j\Bigr) u_i \]

is in mR. Thus we are to arrange that each coefficient of a $u_i$ is divisible by m.

Since $|\det[c_{ij}]| = J$ is relatively prime to m, the system of linear equations

\[ \sum_{j=1}^{n} c_{ij} a_j \equiv b_i \bmod m \]

with unknowns $a_1, \dots, a_n$ has a nonsingular coefficient matrix modulo m and
therefore has a solution.
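The proof is constructive, and in rank 2 the solution of the congruences can be written down with the adjugate matrix. The sketch below is a toy model, not taken from the text: R is modeled as $Z^2$ with its standard basis, A is the subgroup spanned by the columns of a hypothetical matrix `c` of determinant 3, and m = 5. The built-in `pow(det, -1, m)` for modular inverses requires Python 3.8 or later.

```python
def approximate(c, b, m):
    """Solve sum_j c[i][j] * a[j] = b[i] (mod m) for a 2-by-2 integer matrix c
    whose determinant is relatively prime to m."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    inv = pow(det, -1, m)                     # raises ValueError if gcd(det, m) != 1
    a0 = inv * (c[1][1] * b[0] - c[0][1] * b[1]) % m
    a1 = inv * (-c[1][0] * b[0] + c[0][0] * b[1]) % m
    return [a0, a1]

c = [[1, 1],
     [0, 3]]          # v_1 = u_1 and v_2 = u_1 + 3 u_2, so the index J is 3
b = [2, 4]            # r = 2 u_1 + 4 u_2
m = 5

a = approximate(c, b, m)
residual = [b[i] - sum(c[i][j] * a[j] for j in range(2)) for i in range(2)]
print(a, residual)    # every entry of residual should be a multiple of m
```

Here the element $a = a_1 v_1 + a_2 v_2$ of A differs from r by a member of 5R, as the lemma asserts.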


SECOND PART OF PROOF OF THEOREM 5.6. The ring homomorphism of $Z(\Gamma(\xi))$
into $R/(pR + F_i(\xi)R)$ given by the composition of the inclusion followed by the
quotient map descends to a ring homomorphism

\[ Z(\Gamma(\xi))\big/\bigl(pZ(\Gamma(\xi)) + F_i(\xi)Z(\Gamma(\xi))\bigr) \longrightarrow R/(pR + F_i(\xi)R). \qquad (*) \]

To see that (*) is onto, let $r \in R$ be given. Apply Lemma 5.7 with $A = Z(\Gamma(\xi))$,
$J = J(\xi)$, and m = p. Choose
$z \in Z(\Gamma(\xi))$ by the lemma in such a way that z − r is in pR. Under the mapping
(*), the coset of z goes to $r + (z - r) + pR + F_i(\xi)R = r + pR + F_i(\xi)R$, which
is the coset of r. Hence (*) is onto.

To see that (*) is one-one, suppose that z maps to the 0 coset in the image.
Then $z = p r_1 + F_i(\xi) r_2$ with $r_1$ and $r_2$ in R. Lemma 5.7 produces $z_2$ in $Z(\Gamma(\xi))$
with $r_2 - z_2$ in pR. Hence the decomposition $z = p r_1 + F_i(\xi)(r_2 - z_2) + F_i(\xi) z_2$
exhibits z as in $pR + F_i(\xi)Z(\Gamma(\xi))$. The product $F_i(\xi)Z(\Gamma(\xi))$ is in $Z(\Gamma(\xi))$,
since $Z(\Gamma(\xi))$ is a ring, and (*) will be one-one if we show that $pR \cap Z(\Gamma(\xi)) \subseteq
pZ(\Gamma(\xi))$. Let $\{u_i\}$ be a Z basis of R, let $\{v_j\}$ be a Z basis of $Z(\Gamma(\xi))$, and
write $v_j = \sum_i c_{ij} u_i$ for integers $c_{ij}$. If $z'$ is in $pR \cap Z(\Gamma(\xi))$, let us write
$z' = \sum_j a_j v_j$. Substitution gives $z' = \sum_i \bigl(\sum_j a_j c_{ij}\bigr) u_i$. Since $z'$ is in pR, we
see that $\sum_j c_{ij} a_j \equiv 0 \bmod p$ for all i. The determinant of $[c_{ij}]$ is the index $J(\xi)$,
up to sign, and this by assumption is not divisible by p. Therefore $a_j \equiv 0 \bmod p$
for all j, and it follows that $z'$ is in $pZ(\Gamma(\xi))$. Hence (*) is one-one.

We have thus proved that (*) is a ring isomorphism, i.e., that $Z(\Gamma(\xi))/P_i'' \cong
R/P_i$ for all i. The left side is a field, and hence $P_i$ is a prime ideal. From
the isomorphism we obtain $N(P_i) = |Z(\Gamma(\xi))/P_i''| = p^{f_i}$. The computation
$N\bigl(\prod_{i=1}^{g} P_i^{e_i}\bigr) = \prod_{i=1}^{g} N(P_i)^{e_i} = \prod_{i=1}^{g} p^{f_i e_i} = p^{\sum_i e_i f_i} = p^n$ in the last
paragraph of the first part of the proof is now fully justified, and we can therefore
conclude as in the special case that $\prod_{i=1}^{g} P_i^{e_i} = (p)R$.

Finally we have to prove that the ideals $P_i$ are distinct. If indices $i \ne j$ are
given, we know that $P_i'' \ne P_j''$. Choose z in $P_i''$ but not $P_j''$. Then z is in $P_i$
because $P_i'' \subseteq P_i$, and z is not in $P_j$ because the proof above that (*) is one-one
showed that $Z(\Gamma(\xi)) \cap P_j \subseteq P_j''$. This completes the proof of Theorem 5.6.


PROOF OF THEOREM 5.5 WHEN p IS NOT A COMMON INDEX DIVISOR. If p is not
a common index divisor, we can choose a primitive $\xi$ for K/Q such that $\xi$ is in
R and p does not divide $J(\xi) = |R/Z(\Gamma(\xi))|$. Let F(X) be the field polynomial
of $\xi$ over Q. Since $D(\xi) = J(\xi)^2 D_K$, p divides $D_K$ if and only if p divides
$D(\xi)$. Thus p divides $D_K$ if and only if p divides the discriminant of F(X).
This happens if and only if the discriminant of F(X) is $\equiv 0 \bmod p$, if and only
if $\bar F(X)$ has a root of multiplicity > 1 in an algebraic closure of $F_p$, if and only if
the factorization over $F_p$ of $\bar F(X)$ as a product of powers of distinct irreducible
monic polynomials has some factor with exponent > 1. Applying Theorem 5.6,
we see that this last condition is satisfied if and only if the unique factorization
of the ideal (p)R in R as $\prod_{i=1}^{g} P_i^{e_i}$ has some $e_i > 1$.
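The middle link of this chain, that p divides the discriminant of F(X) exactly when $\bar F(X)$ has a repeated root, is easy to test for cubics. For a cubic any repeated irreducible factor of $\bar F$ must be linear, so it suffices to look for a common root of $\bar F$ and its derivative in $F_p$. The sketch below does this by brute force; it is an illustration with a hypothetical function name, and it makes no claim about which primes are common index divisors.

```python
def double_root_primes(coeffs, bound):
    """Primes p < bound for which F = X^3 + a X^2 + b X + c, with coeffs = (a, b, c),
    has a root r in F_p satisfying F(r) = F'(r) = 0."""
    a, b, c = coeffs
    primes = [p for p in range(2, bound)
              if all(p % q for q in range(2, int(p ** 0.5) + 1))]
    hits = []
    for p in primes:
        if any((r ** 3 + a * r * r + b * r + c) % p == 0
               and (3 * r * r + 2 * a * r + b) % p == 0
               for r in range(p)):
            hits.append(p)
    return hits

print(double_root_primes((0, 0, -2), 600))    # X^3 - 2, discriminant -108
print(double_root_primes((1, -2, 8), 600))    # X^3 + X^2 - 2X + 8, discriminant -2012
```

For $X^3 - 2$ the primes found are 2 and 3, the prime divisors of $D_K = -108$. For Dedekind's cubic the primes found are 2 and 503, the prime divisors of $D(\xi) = -2012 = 2^2 \cdot (-503)$; the prime 2 divides $J(\xi)$ there, so Theorem 5.6 does not apply to it with this choice of $\xi$.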


As was mentioned earlier in this section, the difficulty in proving Theorem 5.5 
in complete generality is that we lack sufficient tools for addressing questions by 
localization. The different prime numbers are interacting in some fashion, and the 
above proofs were unable to separate them. The usual technique of localization
in our situation¹⁰ suggests enlarging one or the other of the rings Z and R by
adjoining inverses for all elements not in some prime ideal of interest. Then we
piece together the results. If the localizing is done with respect to a prime ideal
(p) of Z, then Z gets replaced by the subring $S^{-1}\mathbb{Z}$ of all members of Q with no
factors of p in the denominators, and R gets replaced by $S^{-1}R$. One advantage
of this procedure is that $S^{-1}R$ is a principal ideal domain, whereas R is typically
not such a domain.

Localization in that formulation does not by itself reveal a clear path to a proof 
of Theorem 5.5. Two additional ideas enter the argument to make a path seem 
natural; Dedekind succeeded without the second of them, and historically it is 
only with hindsight that one sees the benefit of the second idea. The first idea is 
to use a more fundamental object than the discriminant of K, called the “relative 
different” of K/Q; this makes it possible to aim for a more precise description 
of the ramification indices when they are not equal to 1. The second idea is due 
to K. Hensel and involves forming a kind of completion of the localized rings; 
the ring Z gets replaced by the ring $\mathbb{Z}_p$ of “p-adic integers,” and the field Q
gets replaced by the field $\mathbb{Q}_p$ of “p-adic numbers.” We return to these ideas in
Chapter VI. 


4. Cubic Number Fields as Examples 


In treating examples of cubic fields, it will be convenient to have one further
tool available for computing discriminants. Let $K$ be a number field, let $\xi$ be
a primitive element of $K/\mathbb{Q}$, and let $F(X)$ be its field polynomial over $\mathbb{Q}$. Let
$\xi_i = \sigma_i(\xi)$ be the conjugates of $\xi$, and assume that $\xi_1 = \xi$. The conjugates are
the roots of $F(X)$ in $\mathbb{C}$, and hence

\[ F(X) = \prod_{i=1}^{n} (X - \xi_i). \]

The derivative is $F'(X) = \sum_{i=1}^{n} \prod_{j \neq i} (X - \xi_j)$, and therefore

\[ F'(\xi) = \prod_{j=2}^{n} (\xi - \xi_j). \]

Observe that the form of the left side shows that this element lies in $K$, and it
lies in $R$ if $\xi$ lies in $R$. The different $\mathcal{D}(\xi)$ of the element $\xi$ is defined to be this
element of $K$, namely¹¹

\[ \mathcal{D}(\xi) = F'(\xi) = \prod_{j=2}^{n} (\xi - \xi_j). \]


¹⁰Localization was introduced in Section VIII.10 of Basic Algebra.
¹¹The different of an element is related to the notion of relative different mentioned at the end of
Section 3, but the nature of that relationship will not concern us at this time.


Since $F'(X)$ has coefficients in $\mathbb{Q}$, the conjugates $\sigma_i(F'(\xi))$ of $F'(\xi)$ are the
elements $F'(\sigma_i(\xi)) = F'(\xi_i)$ for $1 \le i \le n$. The formula for $F'(X)$ shows that
$F'(\xi_i) = \prod_{j \neq i} (\xi_i - \xi_j)$. Therefore the norm of $\mathcal{D}(\xi)$ is

\[ N_{K/\mathbb{Q}}(\mathcal{D}(\xi)) = N_{K/\mathbb{Q}}(F'(\xi)) = \prod_{i=1}^{n} F'(\xi_i) = \prod_{i=1}^{n} \prod_{j \neq i} (\xi_i - \xi_j)
   = (-1)^{n(n-1)/2} \prod_{i<j} (\xi_i - \xi_j)^2 = (-1)^{n(n-1)/2} D(\xi). \]

In other words, the norm of the different of $\xi$ is, up to sign, equal to the discriminant
of $\Gamma(\xi)$, which in turn equals the discriminant of the field polynomial of the
primitive element $\xi$. The definitions of $\mathcal{D}(\xi)$ and $D(\xi)$ and the formula connecting
them make sense if $\xi$ is allowed to be any element of $K$, primitive or not. Both
$\mathcal{D}(\xi)$ and $D(\xi)$ have the property of being nonzero if and only if $\xi$ is primitive.


EXAMPLE. For the field $K = \mathbb{Q}(\sqrt[3]{2})$, the different of $\xi = \sqrt[3]{2}$ is
$3X^2\big|_{X = \sqrt[3]{2}} = 3\sqrt[3]{4}$, and the discriminant of $X^3 - 2$, up to the sign $(-1)^{3\cdot 2/2}$, is the norm of this,
i.e.,

\[ D(\sqrt[3]{2}) = -(3\sqrt[3]{4})(3\sqrt[3]{4}\,\omega)(3\sqrt[3]{4}\,\omega^2) = -3^3 2^2, \qquad \text{where } \omega = e^{2\pi i/3}. \]

Alternatively, the norm can be computed from a field polynomial. Specifically
the norm of $3\sqrt[3]{4}$ is the determinant of left multiplication by this element when
considered as a Q linear mapping of K into itself.
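
As a sanity check on the example, the determinant description of the norm can be carried out by machine. The following Python sketch is not part of the original text; the helper names `det3` and `mult_matrix` are illustrative assumptions. It writes left multiplication by $3\sqrt[3]{4}$ as a 3-by-3 matrix in the ordered basis $(1, t, t^2)$ with $t^3 = 2$ and takes its determinant.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def mult_matrix(coeffs, m):
    """Matrix of left multiplication by c0 + c1*t + c2*t^2 on Q(t), where
    t^3 = m, in the ordered basis (1, t, t^2). Columns hold the images."""
    c0, c1, c2 = coeffs
    cols = [
        (c0, c1, c2),          # element * 1
        (m * c2, c0, c1),      # element * t,   using t^3 = m
        (m * c1, m * c2, c0),  # element * t^2, using t^4 = m*t
    ]
    return [[cols[j][i] for j in range(3)] for i in range(3)]


# The different of t = 2^(1/3) is F'(t) = 3*t^2, i.e. coefficients (0, 0, 3).
norm_of_different = det3(mult_matrix((0, 0, 3), 2))
discriminant = -norm_of_different  # the sign (-1)^(3*2/2) equals -1
print(norm_of_different, discriminant)
```

The same two helpers apply to any element of a pure cubic field once its coefficients in the basis $(1, t, t^2)$ are known.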


We saw already in Example 2 of Section 2 that $D(\sqrt[3]{2}) = -3^3 2^2$, but the
earlier method of computation was longer. At the end of Section 2, we saw in
addition that $\{1, \sqrt[3]{2}, \sqrt[3]{4}\}$ is a Z basis of the ring of algebraic integers in the field
$K = \mathbb{Q}(\sqrt[3]{2})$. The use of differents does not simplify the proof of this latter fact.

In this section we consider further examples of cubic extensions of Q. The
first such fields that we study are the pure cubic extensions $K = \mathbb{Q}(\sqrt[3]{m})$, where
m is any cube-free positive integer > 1. Already with these fields K, we shall see
that $D_K$ is not necessarily equal to $D(\xi)$ for some algebraic integer $\xi$. However,
all these fields have no common index divisors. Then we examine Dedekind’s
example of a cubic number field for which 2 is a common index divisor.




The correspondence of cube-free integers m > 1 to fields $\mathbb{Q}(\sqrt[3]{m})$ is many-
to-one: if m is given, form m' from m by doing the following for every prime p
dividing m: divide by p if $p^2$ divides m, and multiply by p if $p^2$ does not divide
m; then $\mathbb{Q}(\sqrt[3]{m}) = \mathbb{Q}(\sqrt[3]{m'})$. In analyzing
$\mathbb{Q}(\sqrt[3]{m})$, it will be convenient to normalize matters so as to resolve this ambiguity.
We can write m uniquely as a product $m = ab^2$ for positive square-free integers
a and b; these have GCD(a, b) = 1, $b^2$ is the largest square dividing m, and a is
given by $a = m/b^2$. Then m and $m' = a^2 b$ lead to the same field.


Proposition 5.8. For a cube-free integer m > 1, let $K = \mathbb{Q}(\sqrt[3]{m})$, and let R
be the ring of algebraic integers in K. Write $m = ab^2$ for positive square-free
integers a and b with GCD(a, b) = 1, and define two members of R to be the
real cube roots $\theta_1 = \sqrt[3]{ab^2}$ and $\theta_2 = \sqrt[3]{a^2 b}$. Then a Z basis of R consists of

(a) $\{1, \theta_1, \theta_2\}$ if $a \not\equiv \pm b \bmod 9$, i.e., if m is of Type I,
(b) $\{\tfrac13(1 \pm \theta_1 \pm \theta_2), \theta_1, \theta_2\}$ for exactly one choice of the pair of signs if
$a \equiv \pm b \bmod 9$, i.e., if m is of Type II.


In the respective cases the field discriminant is given by

\[ D_K = \begin{cases} -27a^2b^2 & \text{if } m \text{ is of Type I}, \\ -3a^2b^2 & \text{if } m \text{ is of Type II}. \end{cases} \]


REMARKS. More precisely in Type II, the congruence $a \equiv \pm b \bmod 9$ implies
that a and b are prime to 3. Choose signs $s = \pm 1$ and $t = \pm 1$ such that
$sa \equiv 1 \bmod 3$ and $tb \equiv 1 \bmod 3$. Then the first member of the Z basis is to be
$\tfrac13(1 + s\theta_1 + t\theta_2)$. The smallest m leading to Type I is m = 2, and this case was
examined in Example 2 in Section 2. The smallest m leading to Type II is m = 10,
and then the first member of the asserted Z basis of R is $\tfrac13(1 + \sqrt[3]{10} + \sqrt[3]{100})$.
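
The classification in Proposition 5.8 is easy to mechanize. The Python sketch below is an illustration only, not part of the original text, and its function names are assumptions made for the example. It writes a cube-free m as $ab^2$, tests the congruence $a \equiv \pm b \bmod 9$, and returns the type together with the field discriminant stated in the proposition.

```python
def split_cube_free(m):
    """Write a cube-free integer m > 1 as a * b**2 with a, b square-free."""
    a, b, p, n = 1, 1, 2, m
    while n > 1:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e == 1:
            a *= p
        elif e == 2:
            b *= p
        elif e >= 3:
            raise ValueError("m is not cube-free")
        p += 1
    return a, b


def pure_cubic_type(m):
    """Return (type, field discriminant) as stated in Proposition 5.8."""
    a, b = split_cube_free(m)
    type_two = (a - b) % 9 == 0 or (a + b) % 9 == 0
    disc = (-3 if type_two else -27) * a * a * b * b
    return ("II" if type_two else "I"), disc


for m in (2, 10, 175):
    print(m, split_cube_free(m), pure_cubic_type(m))
```

The values m = 2 and m = 10 are the two cases singled out in the remarks; m = 175 is included because it reappears later in the section.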


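PROOF. Let $\omega = e^{2\pi i/3}$. The conjugates of $\theta_1$ can be taken to be $\sigma_1(\theta_1) = \theta_1$,
$\sigma_2(\theta_1) = \omega\theta_1$, and $\sigma_3(\theta_1) = \omega^2\theta_1$. Since $\theta_1^2 = b\theta_2$, we have $\sigma_i(\theta_2) = b^{-1}\sigma_i(\theta_1)^2$,
and therefore $\sigma_1(\theta_2) = \theta_2$, $\sigma_2(\theta_2) = \omega^2\theta_2$, and $\sigma_3(\theta_2) = \omega\theta_2$. In view of the
formula before Proposition 5.2, $D((1, \theta_1, \theta_2))$ is the square of

\[ \det \begin{pmatrix} 1 & 1 & 1 \\ \theta_1 & \omega\theta_1 & \omega^2\theta_1 \\ \theta_2 & \omega^2\theta_2 & \omega\theta_2 \end{pmatrix}, \]

and we calculate that $D((1, \theta_1, \theta_2)) = -27a^2b^2$.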

Let us apply Proposition 5.2 to the triple $\{1, \theta_1, \theta_2\}$ of members of R. For each
prime p dividing $27a^2b^2$, we are to check whether certain elements are integral.
First suppose that p divides a but $p \neq 3$. It is enough to check the elements
$p^{-1}(a_0 + \theta_1)$ or $p^{-1}(a_0 + a_1\theta_1 + \theta_2)$ for integrality when $a_0$ and $a_1$ are integers
from 0 to p − 1. Form the extension $L = K(\sqrt[3]{p}) = \mathbb{Q}(\sqrt[3]{m}, \sqrt[3]{p})$ of K, and
let T be its ring of algebraic integers. The degree $[L : \mathbb{Q}]$ equals 9 if $L \neq K$ and
equals 3 if $L = K$. If $p^{-1}(a_0 + \theta_1)$ is integral, then $a_0 + p^{1/3}((a/p)b^2)^{1/3} = pr$
with $r \in R$, and hence $a_0 = p^{1/3}c$ with $c \in T$. Applying $N_{L/\mathbb{Q}}$ to both sides, we
obtain $a_0^9 = p^3 N_{L/\mathbb{Q}}(c)$ if $L \neq K$, and we obtain $a_0^3 = p N_{K/\mathbb{Q}}(c)$ if $L = K$. In
either case, p divides $a_0$, and $a_0 = 0$. So $p^{-1}\theta_1$ is integral, in contradiction to
the facts that the field polynomial for K of $p^{-1}\theta_1$ is $X^3 - p^{-3}ab^2$ and that $ab^2$
contains p as a factor only once. We conclude that $p^{-1}(a_0 + \theta_1)$ is not integral.

Similarly if the element $p^{-1}(a_0 + a_1\theta_1 + \theta_2)$ is integral, then we see that
$a_0 + a_1 p^{1/3}((a/p)b^2)^{1/3} + p^{2/3}((a/p)^2 b)^{1/3} = pr$ with $r \in R$. So $a_0 = p^{1/3}c$
with $c \in T$, and the same argument as above shows that $a_0 = 0$. Hence
$a_1((a/p)b^2)^{1/3} + p^{1/3}((a/p)^2 b)^{1/3} = p^{2/3}r$, and $a_1((a/p)b^2)^{1/3} = p^{1/3}c'$ with
$c' \in T$. Taking the norm gives $a_1^9((a/p)b^2)^3 = p^3 N_{L/\mathbb{Q}}(c')$ if $L \neq K$ and
$a_1^3(a/p)b^2 = p N_{K/\mathbb{Q}}(c')$ if $L = K$. Since a/p and b are prime to p, we conclude
that p divides $a_1$ in both cases. Therefore $a_1 = 0$, and $p^{-1}\theta_2$ is integral. The
field polynomial for K of $p^{-1}\theta_2$ is $X^3 - p^{-3}a^2 b$, and $a^2 b$ contains p as a factor
only twice. We conclude that $p^{-1}(a_0 + a_1\theta_1 + \theta_2)$ is not integral.

This disposes of the prime divisors of a other than p = 3, and we handle
the prime divisors of b other than p = 3 in the same way, except that we start
from the ordered triple $(1, \theta_2, \theta_1)$ and therefore need check only $p^{-1}(a_0 + \theta_2)$
and $p^{-1}(a_0 + a_1\theta_2 + \theta_1)$.

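Now let us apply Proposition 5.2 to the ordered triple $(1, \theta_1, \theta_2)$ for the prime
p = 3, except that we allow coefficients 0 and ±1 instead of 0, 1, 2. We check
integrality for the elements $\tfrac13(1 \pm \theta_1)$, $\tfrac13(1 \pm \theta_2)$, $\tfrac13(\theta_1 \pm \theta_2)$, and $\tfrac13(1 \pm \theta_1 \pm \theta_2)$
by checking whether the coefficients of their field polynomials are in Z. For the
first two, let $\varphi$ be $\pm\theta_1$ or $\pm\theta_2$. The coefficient of the first-degree term in the field
polynomial of $\tfrac13(1 + \varphi)$ is $\tfrac19$ times

\[ \begin{aligned}
(1+\varphi)(1+\omega\varphi) &+ (1+\varphi)(1+\omega^2\varphi) + (1+\omega\varphi)(1+\omega^2\varphi) \\
&= (1+\varphi)(2 + \omega\varphi + \omega^2\varphi) + (1+\omega\varphi)(1+\omega^2\varphi) \\
&= (1+\varphi)(2-\varphi) + (1 - \varphi + \varphi^2) = 2 + \varphi - \varphi^2 + 1 - \varphi + \varphi^2 = 3,
\end{aligned} \]

hence is $\tfrac13$. This is not an integer, and thus $\tfrac13(1 + \varphi)$ is not in R. If $\varphi = \pm\theta_1$ and
$\psi = \pm\theta_2$, then the corresponding computation for $\varphi + \psi$ is

\[ \begin{aligned}
(\varphi+\psi)(\omega\varphi+\omega^2\psi) &+ (\varphi+\psi)(\omega^2\varphi+\omega\psi) + (\omega\varphi+\omega^2\psi)(\omega^2\varphi+\omega\psi) \\
&= -(\varphi+\psi)(\varphi+\psi) + (\varphi^2 - \varphi\psi + \psi^2) \\
&= -3\varphi\psi = -3ab(\operatorname{sgn}\varphi)(\operatorname{sgn}\psi), \qquad (*)
\end{aligned} \]

and $\tfrac19$ of this is an integer only if 3 divides ab. In this case our hypotheses show
that 9 does not divide ab. The constant term in the field polynomial of $\tfrac13(\varphi + \psi)$
is $-\tfrac1{27}$ times

\[ \begin{aligned}
(\varphi+\psi)(\omega\varphi+\omega^2\psi)(\omega^2\varphi+\omega\psi) &= \varphi^3 + \psi^3 \\
&= (\operatorname{sgn}\varphi)ab^2 + (\operatorname{sgn}\psi)a^2 b \\
&= ab(b\operatorname{sgn}\varphi + a\operatorname{sgn}\psi). \qquad (**)
\end{aligned} \]

When 3 divides ab exactly once, 3 divides $(**)$ exactly once, and hence $-\tfrac1{27}$ of
$(**)$ is not an integer. Thus $\tfrac13(\varphi + \psi)$ is not in R.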

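It remains to check $\tfrac13(1 + \varphi + \psi)$ with $\varphi = \pm\theta_1$ and $\psi = \pm\theta_2$. The coefficient
of the second-degree term in the field polynomial of $\tfrac13(1 + \varphi + \psi)$ is equal to
$-\tfrac13\operatorname{Tr}(1 + \varphi + \psi) = -1$ and is an integer; thus it imposes no restrictions. The
first-degree term of the field polynomial is $\tfrac19$ of

\[ \begin{aligned}
(1+\varphi+\psi)(1+\omega\varphi+\omega^2\psi) &+ (1+\varphi+\psi)(1+\omega^2\varphi+\omega\psi) \\
&\quad + (1+\omega\varphi+\omega^2\psi)(1+\omega^2\varphi+\omega\psi) \\
&= (1+\varphi+\psi)(2-\varphi-\psi) + (1 - \varphi - \psi + \varphi^2 - \varphi\psi + \psi^2) \\
&= 3 - 3\varphi\psi = 3\big(1 - ab(\operatorname{sgn}\varphi)(\operatorname{sgn}\psi)\big), \qquad (\dagger)
\end{aligned} \]

and $\tfrac19$ of $(\dagger)$ is an integer if and only if $ab \equiv (\operatorname{sgn}\varphi)(\operatorname{sgn}\psi) \bmod 3$. In particular,
the proof is now complete unless $ab \equiv (\operatorname{sgn}\varphi)(\operatorname{sgn}\psi) \bmod 3$. Thus we may
assume from now on that neither a nor b is divisible by 3.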

The constant term of the field polynomial of $\tfrac13(1 + \varphi + \psi)$ is $-\tfrac1{27}$ times

\[ \begin{aligned}
(1+\varphi+\psi)(1+\omega\varphi+\omega^2\psi)(1+\omega^2\varphi+\omega\psi)
&= 1 + \operatorname{Tr}_{K/\mathbb{Q}}(\varphi + \psi) + (*) + (**) \\
&= 1 + 0 - 3ab(\operatorname{sgn}\varphi)(\operatorname{sgn}\psi) + ab(b\operatorname{sgn}\varphi + a\operatorname{sgn}\psi).
\end{aligned} \]

Put $\alpha = a\operatorname{sgn}\varphi$ and $\beta = b\operatorname{sgn}\psi$, so that $1 - 3\alpha\beta + \alpha\beta(\alpha + \beta)$ is to be divisible
by 27. Since neither $\beta$ nor $\alpha$ is divisible by 3, we can define $l \bmod 27$ by the
congruence $\beta \equiv l\alpha \bmod 27$. Substituting shows that $1 - 3l\alpha^2 + l\alpha^2(\alpha + l\alpha) \equiv
0 \bmod 27$, hence that $l(l+1)\alpha^3 \equiv 3l\alpha^2 - 1 \bmod 27$, which we can rewrite as

\[ \alpha^3 l^2 + (\alpha^3 - 3\alpha^2)l + 1 \equiv 0 \bmod 27. \]

Completing the square in $l$ allows us to write this congruence as

\[ \big(l + \tfrac12(1 - 3\alpha^{-1})\big)^2 \equiv \tfrac14(1 - 3\alpha^{-1})^2 - \alpha^{-3} \bmod 27. \]

Factoring the right side, we obtain

\[ \big(l + \tfrac12(1 - 3\alpha^{-1})\big)^2 \equiv \tfrac14\alpha^{-4}\big[\alpha(\alpha - 1)^2(\alpha - 4)\big] \bmod 27. \qquad (\dagger\dagger) \]

If $\alpha \equiv 1 \bmod 3$, the expression in square brackets on the right side is $\equiv 0 \bmod 27$,
and 0 is the square of 0 and ±9. If $\alpha \equiv 2 \bmod 3$, then the expression in square
brackets is a square if and only if $\alpha(\alpha - 4) \equiv c^2 \bmod 27$. Considering the
congruence only modulo 3 gives $2(-2) \equiv c^2 \bmod 3$ and therefore $c^2 \equiv 2 \bmod 3$,
which has no solutions. Thus $\alpha \equiv 2 \bmod 3$ leads to no solutions of $(\dagger\dagger)$. We can
summarize by saying that the solutions of $(\dagger\dagger)$ are given by $\alpha \equiv 1 \bmod 3$ and

\[ l + \tfrac12(1 - 3\alpha^{-1}) \equiv 0 \bmod 9. \]

One checks that the values $\alpha \equiv 1, 4, 7 \bmod 9$ all lead to $l \equiv 1$.
Let us summarize. Let s and t be signs ±1. Then $\tfrac13(1 + s\theta_1 + t\theta_2)$ is integral
if and only if both of the following conditions are satisfied:
(i) $sa \equiv tb \equiv 1 \bmod 3$,
(ii) $sa \equiv tb \bmod 9$.
When these conditions are satisfied, we are in Type II; otherwise we are in Type I.
This completes the proof.
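
The last step of the proof reduces to a finite statement about residues modulo 27, so it can be confirmed by exhaustive search. The following Python sketch is an independent check, not the method of the proof; the names `alpha`, `beta`, and `integral_candidates` are chosen for the illustration. It lists all pairs prime to 3 for which $1 - 3\alpha\beta + \alpha\beta(\alpha + \beta)$ is divisible by 27 and compares them with conditions (i) and (ii).

```python
def integral_candidates(mod=27):
    """Pairs (alpha, beta) mod 27, both prime to 3, for which
    1 - 3*alpha*beta + alpha*beta*(alpha + beta) is divisible by 27."""
    hits = set()
    for alpha in range(mod):
        for beta in range(mod):
            if alpha % 3 == 0 or beta % 3 == 0:
                continue
            value = 1 - 3 * alpha * beta + alpha * beta * (alpha + beta)
            if value % mod == 0:
                hits.add((alpha, beta))
    return hits


# Conditions (i) and (ii): alpha = beta = 1 mod 3 and alpha = beta mod 9.
expected = {
    (alpha, beta)
    for alpha in range(27)
    for beta in range(27)
    if alpha % 3 == 1 and beta % 3 == 1 and (alpha - beta) % 9 == 0
}

print(integral_candidates() == expected, len(expected))
```

The agreement can also be seen by hand: writing $\alpha = 1 + 3u$ and $\beta = 1 + 3v$ turns the expression into $9(u^2 + uv + v^2) + 27uv(u + v)$, which is divisible by 27 exactly when $u \equiv v \bmod 3$.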


In the setting of Type I in Proposition 5.8, let us form the discriminants of
$\Gamma(\theta_1) = (1, \theta_1, \theta_1^2)$ and $\Gamma(\theta_2) = (1, \theta_2, \theta_2^2)$. Using the method of computation
at the beginning of this section, we see that the differents in the two cases are
$3\theta_1^2$ and $3\theta_2^2$. Therefore the discriminant of $\Gamma(\theta_1)$ is $D(\theta_1) = -N_{K/\mathbb{Q}}(3\theta_1^2) =
-3^3(\theta_1^3)^2 = -3^3(ab^2)^2 = -3^3a^2b^4$, and the discriminant of $\Gamma(\theta_2)$ similarly is
$D(\theta_2) = -3^3a^4b^2$. The absolute value of the greatest common divisor of these
two expressions is $3^3a^2b^2 = |D_K|$, and therefore there are never any common
index divisors in Type I.

On the other hand, there exist situations in Type I in which no primitive element
$\xi$ of $\mathbb{Q}(\sqrt[3]{m})$ lying in R has $\Gamma(\xi)$ as a Z basis. To prove this fact, we make use
of the following proposition.


Proposition 5.9. For a pure cubic extension $K = \mathbb{Q}(\sqrt[3]{ab^2})$ of Type I, an
element $\xi = x + y\theta_1 + z\theta_2$ with Z coefficients has $D(\xi) = D_K$ if and only if
$y^3 b - z^3 a = \pm 1$.


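PROOF. The matrix whose determinant is $D(\Gamma(\xi))$ is given by

\[ M = \begin{pmatrix} 3 & \operatorname{Tr}(\xi) & \operatorname{Tr}(\xi^2) \\ \operatorname{Tr}(\xi) & \operatorname{Tr}(\xi^2) & \operatorname{Tr}(\xi^3) \\ \operatorname{Tr}(\xi^2) & \operatorname{Tr}(\xi^3) & \operatorname{Tr}(\xi^4) \end{pmatrix}, \]

where Tr is short for $\operatorname{Tr}_{K/\mathbb{Q}}$. The element $\theta_1^i\theta_2^j$ has conjugates $\theta_1^i\theta_2^j$, $\omega^{i+2j}\theta_1^i\theta_2^j$,
and $\omega^{2i+j}\theta_1^i\theta_2^j$, where $\omega = e^{2\pi i/3}$. Thus

\[ \operatorname{Tr}(\theta_1^i\theta_2^j) = (1 + \omega^{i+2j} + \omega^{2i+j})\,\theta_1^i\theta_2^j = (1 + \omega^{i+2j} + \omega^{2(i+2j)})\,\theta_1^i\theta_2^j. \]

This is 0 if $i + 2j$ is not divisible by 3 and is $3\theta_1^i\theta_2^j$ otherwise. We compute the
trace of each power of $\xi$ by applying the formula

\[ \operatorname{Tr}(\xi^l) = \sum_{k=0}^{l} \binom{l}{k} x^{l-k} \operatorname{Tr}\big((y\theta_1 + z\theta_2)^k\big), \]

which comes from treating $\xi$ as a binomial. The traces of the powers of $y\theta_1 + z\theta_2$
work out to be

\[ \begin{aligned}
\tfrac13\operatorname{Tr}(y\theta_1 + z\theta_2) &= 0, \\
\tfrac13\operatorname{Tr}\big((y\theta_1 + z\theta_2)^2\big) &= 2yz\theta_1\theta_2 = ab(2yz), \\
\tfrac13\operatorname{Tr}\big((y\theta_1 + z\theta_2)^3\big) &= ab(y^3 b + z^3 a), \\
\tfrac13\operatorname{Tr}\big((y\theta_1 + z\theta_2)^4\big) &= (ab)^2 6y^2z^2.
\end{aligned} \]

Substituting, we find the following formulas for the trace of each power of $\xi$:

\[ \begin{aligned}
\tfrac13\operatorname{Tr}(\xi) &= x, \\
\tfrac13\operatorname{Tr}(\xi^2) &= x^2 + 2(ab)yz, \\
\tfrac13\operatorname{Tr}(\xi^3) &= x^3 + 3x(ab)2yz + (ab)(y^3 b + z^3 a), \\
\tfrac13\operatorname{Tr}(\xi^4) &= x^4 + 6x^2(ab)2yz + 4x(ab)(y^3 b + z^3 a) + (ab)^2 6y^2z^2.
\end{aligned} \]

The matrix M is therefore of the form

\[ \tfrac13 M = \begin{pmatrix} 1 & x & x^2 + A \\ x & x^2 + A & x^3 + B \\ x^2 + A & x^3 + B & x^4 + C \end{pmatrix}, \]

where

\[ \begin{aligned}
A &= 2(ab)yz, \\
B &= 3x(ab)2yz + (ab)(y^3 b + z^3 a), \\
C &= 6x^2(ab)2yz + 4x(ab)(y^3 b + z^3 a) + (ab)^2 6y^2z^2.
\end{aligned} \]

Expansion of $\det \tfrac13 M$ results in an expression that simplifies to

\[ \det \tfrac13 M = AC + 2xAB - 3x^2A^2 - A^3 - B^2. \]

Thus we have only to substitute. The resulting expression simplifies greatly, and
we obtain $\det \tfrac13 M = -(ab)^2(y^3 b - z^3 a)^2$. Consequently

\[ D(\xi) = -3^3 (ab)^2 (y^3 b - z^3 a)^2. \]

Since Proposition 5.8 has shown that $D_K = -3^3(ab)^2$, the result follows.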
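
The closed formula for $D(\xi)$ can be spot-checked numerically without the algebraic simplification. The Python sketch below is an illustration under stated assumptions, not the computation of the proof, and the helper names are hypothetical. It builds the matrix of multiplication by $\xi = x + y\theta_1 + z\theta_2$ in the basis $(1, \theta_1, \theta_2)$ from the relations $\theta_1^2 = b\theta_2$, $\theta_1\theta_2 = ab$, $\theta_2^2 = a\theta_1$, reads off $\operatorname{Tr}(\xi^k)$ as the trace of the k-th matrix power, and compares the determinant of the trace matrix with $-27(ab)^2(y^3 b - z^3 a)^2$.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def matmul(p, q):
    """Product of two 3x3 integer matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def disc_of_element(a, b, x, y, z):
    """D(Gamma(xi)) for xi = x + y*th1 + z*th2, via traces of powers."""
    cols = [
        (x, y, z),              # xi * 1
        (z * a * b, x, y * b),  # xi * th1
        (y * a * b, z * a, x),  # xi * th2
    ]
    mat = [[cols[j][i] for j in range(3)] for i in range(3)]
    power = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    traces = []
    for _ in range(5):          # traces of xi^0, ..., xi^4
        traces.append(power[0][0] + power[1][1] + power[2][2])
        power = matmul(power, mat)
    gram = [[traces[i + j] for j in range(3)] for i in range(3)]
    return det3(gram)


def closed_form(a, b, y, z):
    return -27 * (a * b) ** 2 * (y ** 3 * b - z ** 3 * a) ** 2


print(disc_of_element(2, 1, 0, 1, 0), closed_form(2, 1, 1, 0))
```

For a = 2, b = 1 and $\xi = \theta_1$ both values are −108, in agreement with the example at the beginning of the section.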




Thus in order to give an example of an m for which no $\xi$ has $D(\xi) = D_K$, we
have only to select a and b for which the Diophantine equation $y^3 b - z^3 a = \pm 1$
in y, z has no solution. Choose a = 7 and b = 5, so that $m = ab^2 = 175$. To
verify that the Diophantine equation has no solution, take the equation modulo
7 and then modulo 5, obtaining $5y^3 \equiv \pm 1 \bmod 7$ and $-7z^3 \equiv \pm 1 \bmod 5$. These
congruences say that $y^3 \equiv \pm 3 \bmod 7$ and $z^3 \equiv \pm 2 \bmod 5$. The only cubes modulo
7 are 0 and ±1, and thus the congruence for y has no solution.
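
The congruence argument is small enough to verify directly. This short Python sketch is an illustrative check only, with variable names chosen for the example. It lists the cubes modulo 7 and confirms that no residue y satisfies $5y^3 \equiv \pm 1 \bmod 7$.

```python
# Cubes of all residues modulo 7.
cubes_mod_7 = sorted({pow(y, 3, 7) for y in range(7)})

# Residues y with 5*y^3 = +1 or -1 mod 7; the text asserts there are none.
solutions = [y for y in range(7)
             if (5 * y ** 3 - 1) % 7 == 0 or (5 * y ** 3 + 1) % 7 == 0]

print(cubes_mod_7, solutions)
```

Since the equation $5y^3 - 7z^3 = \pm 1$ reduces modulo 7 to a congruence in y alone, an empty list of residues rules out integer solutions for every z.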

We turn to the question of the splitting of prime ideals in pure cubic extensions
$K = \mathbb{Q}(\sqrt[3]{m})$. In the notation of Proposition 5.8, we again write $m = ab^2$, and
we shall assume that the extension is of Type I. We saw in Proposition 5.8 and
the remarks afterward that $D_K$ equals the greatest common divisor of $D(\sqrt[3]{ab^2})$
and $D(\sqrt[3]{a^2 b})$. Therefore the splitting of every prime ideal (p) in Z is described
by Theorem 5.6. We have only to sort out the details.


Proposition 5.10. Let $K = \mathbb{Q}(\sqrt[3]{m})$ be a pure cubic extension of Type I, and
let R be its ring of algebraic integers. If p is a prime number, then the ideal (p)R
of R splits into prime ideals as follows:

(a) $(p)R = P_1P_2$ with $N(P_1) = p$ and $N(P_2) = p^2$ if $p \equiv -1 \bmod 3$ and
p does not divide $D_K$,

(b) $(p)R = P_1P_2P_3$ with $P_1, P_2, P_3$ distinct of norm p if $p \equiv 1 \bmod 3$,
$x^3 \equiv m \bmod p$ is solvable in $\mathbb{F}_p$, and p does not divide $D_K$,

(c) (p)R is prime of norm $p^3$ if $p \equiv 1 \bmod 3$, $x^3 \equiv m \bmod p$ is not solvable
in $\mathbb{F}_p$, and p does not divide $D_K$,

(d) $(p)R = P^3$ with $N(P) = p$ if p divides $D_K$.


PROOF. The prime divisors of $D_K$ are 3 and the prime divisors of a and b.
For all other primes Theorem 5.6 shows that all ramification indices are 1. Let
p be a prime of the form $6k \pm 1$ not dividing $D_K$. The multiplicative group $\mathbb{F}_p^\times$
of $\mathbb{F}_p$ is cyclic of order p − 1 and hence has order divisible by 3 if and only if
p = 6k + 1. Thus there are three cube roots of 1 when p = 6k + 1 but only 1
when p = 6k − 1. In the latter case the cubing map is one-one onto from $\mathbb{F}_p^\times$
to itself. Thus $X^3 - m$ factors modulo p as the product of a first-degree factor
and an irreducible second-degree factor if p = 6k − 1, and (a) follows for such
primes from Theorem 5.6. If p = 6k + 1, then $X^3 - m$ either factors modulo p
as the product of three first-degree factors or is irreducible, since 1 has three cube
roots. Thus (b) and (c) follow for such primes from Theorem 5.6.

For p = 2 if m is odd, then $X^3 - m \equiv X^3 - 1 = (X - 1)(X^2 + X + 1) \bmod 2$,
and we are in the situation of (a). This completes the discussion of primes that
do not divide $D_K$. If p divides m, then $X^3 - m \equiv X^3 \bmod p$ is the cube of a
first-degree factor, and (d) follows in these cases. For p = 3 whether or not p
divides m, we have $X^3 - m \equiv X^3 - m^3 \equiv (X - m)^3 \bmod 3$, and (d) follows in
this case.
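
The case distinction in Proposition 5.10 depends only on whether p divides 3ab and on the number of roots of $X^3 - m$ modulo p. The Python sketch below is an illustration that assumes Type I, as in the proposition; the function name `splitting_type` is an assumption for the example. It returns the letter of the applicable case.

```python
def splitting_type(m, a, b, p):
    """Case letter of Proposition 5.10 for the prime p, where m = a*b**2
    is of Type I. For p not dividing 3ab, the number of roots of
    X^3 - m modulo p is 1, 3, or 0, matching cases (a), (b), (c)."""
    if (3 * a * b) % p == 0:
        return "d"
    roots = sum(1 for x in range(p) if (x ** 3 - m) % p == 0)
    return {1: "a", 3: "b", 0: "c"}[roots]


# The field Q(2^(1/3)): m = 2, a = 2, b = 1.
for p in (2, 3, 5, 7, 11, 13, 31):
    print(p, splitting_type(2, 2, 1, p))
```

For m = 2 the first prime of type (b) in this list is p = 31, since $2^{10} \equiv 1 \bmod 31$ shows that 2 is a cube modulo 31.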




We conclude this section by discussing Dedekind’s example of a common
index divisor. The field in question is again of degree 3 over Q but is not of the
form $\mathbb{Q}(\sqrt[3]{m})$. Instead, the field is $K = \mathbb{Q}(\xi)$, where $\xi$ is a root of $F(X) =
X^3 + X^2 - 2X + 8$. The polynomial F(X) is irreducible over Q because Gauss’s
Lemma shows that its only possible linear factors are X − k with k dividing 8
and because routine computation rules out each such linear factor. As usual, let
R be the ring of algebraic integers in K.

The different of $\xi$ is $\mathcal{D}(\xi) = F'(\xi) = 3\xi^2 + 2\xi - 2$, and the discriminant $D(\xi)$
therefore is given by $D(\xi) = -N_{K/\mathbb{Q}}(3\xi^2 + 2\xi - 2)$. We calculate this norm as the
determinant of left multiplication by $3\xi^2 + 2\xi - 2$ on K, using the ordered basis
$(1, \xi, \xi^2)$. Since $\xi^3 = -\xi^2 + 2\xi - 8$ and $\xi^4 = -\xi^3 + 2\xi^2 - 8\xi = 3\xi^2 - 10\xi + 8$,
we have

\[ \begin{aligned}
(3\xi^2 + 2\xi - 2)(1) &= -2 + 2\xi + 3\xi^2, \\
(3\xi^2 + 2\xi - 2)(\xi) &= -2\xi + 2\xi^2 + 3\xi^3 = -24 + 4\xi - \xi^2, \\
(3\xi^2 + 2\xi - 2)(\xi^2) &= -2\xi^2 + 2\xi^3 + 3\xi^4 = 8 - 26\xi + 5\xi^2.
\end{aligned} \]

Thus

\[ N_{K/\mathbb{Q}}(3\xi^2 + 2\xi - 2) = \det \begin{pmatrix} -2 & -24 & 8 \\ 2 & 4 & -26 \\ 3 & -1 & 5 \end{pmatrix} = 2^2 \cdot 503, \]

and $D(\xi) = -2^2 \cdot 503$. Thus either the index $J(\xi)$ of $\mathbb{Z}(\Gamma(\xi))$ in R is 1 with
$D_K = -2^2 \cdot 503$, or $J(\xi) = 2$ with $D_K = -503$.
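
The norm computation for Dedekind's example can be replayed mechanically. The following Python sketch is an illustration only; the helpers `reduce_mod_F` and `det3` are names invented for the example. It reduces the products $(3\xi^2 + 2\xi - 2)\xi^k$ using $\xi^3 = -\xi^2 + 2\xi - 8$, assembles the matrix of left multiplication in the basis $(1, \xi, \xi^2)$, and evaluates its determinant.

```python
def reduce_mod_F(poly):
    """Reduce a polynomial in xi (coefficients from low to high degree)
    using the relation xi^3 = -8 + 2*xi - xi^2."""
    poly = list(poly)
    while len(poly) > 3:
        top = poly.pop()   # coefficient of xi^k with k = len(poly) >= 3
        k = len(poly)
        poly[k - 3] += -8 * top
        poly[k - 2] += 2 * top
        poly[k - 1] += -1 * top
    return poly + [0] * (3 - len(poly))


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


different = [-2, 2, 3]  # 3*xi^2 + 2*xi - 2, low degree first
columns = [reduce_mod_F([0] * k + different) for k in range(3)]
matrix = [[columns[j][i] for j in range(3)] for i in range(3)]
norm = det3(matrix)
print(columns, norm)
```

The three columns reproduce the three displayed products, and the determinant equals $4 \cdot 503 = 2012$.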

Problems 24–25 at the end of the chapter show that $\tfrac12(\xi^2 + \xi)$ is in R and
that consequently the correct choice is $J(\xi) = 2$ with $D_K = -503$ and with
$\{1, \xi, \tfrac12(\xi^2 + \xi)\}$ as a Z basis of R. In fact, 2 divides $J(\eta)$ for every primitive
element $\eta$ of K lying in R, and therefore 2 is a common index divisor in the sense
of Section 2. One way to check this assertion would be to calculate $D(\eta)$ for
every such $\eta$. The computation would be feasible because we can express $\eta$ as a
Z linear combination of the members of $\{1, \xi, \tfrac12(\xi^2 + \xi)\}$ and calculate the field
polynomial of $\eta$ in the same way that $N_{K/\mathbb{Q}}(\mathcal{D}(\xi))$ was calculated above.

However, there is an easier way. Problem 28 at the end of the chapter shows
that (2)R splits as the product of three distinct prime ideals of R. If there were
some $\eta$ for which 2 did not divide $J(\eta)$, then Theorem 5.6 would show that the
minimal polynomial of $\eta$ when reduced modulo 2 splits as the product of three
distinct first-degree factors. But $\mathbb{F}_2$ has only 2 elements, hence only two possible
distinct linear factors to offer. Thus Theorem 5.6 must not be applicable to $\eta$ and
the prime 2, and we conclude that 2 divides $J(\eta)$. Going over this argument, we
see that we have established the following more general result.




Proposition 5.11. Let K/Q be a field extension of degree n, and let R be the
ring of algebraic integers in K. If p is a prime number with $2 \le p \le n - 1$ such
that (p)R splits as the product of n distinct prime ideals of R, then p is a common
index divisor for K.


5. Dirichlet Unit Theorem 


Let K be a number field of degree n over Q, and let R be its ring of algebraic
integers. We regard K as a subfield of C. The units of K are understood to
be the members of the group $R^\times$ of units of the ring R. As was observed in
Section 2, there exist exactly n field mappings of K into C, and we denote them
by $\sigma_1, \dots, \sigma_n$; one of these is the inclusion of K into C. If x is in K, then the
images $\sigma_1(x), \dots, \sigma_n(x)$ are called the conjugates of x.

In Section I.6 we studied the group of units in the quadratic case n = 2,
and we found, particularly in the problems at the end of that chapter, that an
understanding of this group was essential to working successfully on the number-
theoretic problems studied in that chapter. When n = 2, we found that the
qualitative nature of the group $R^\times$ depends on the sign of the field discriminant.
The group turned out to be the finite subgroup of roots of unity in K if $D_K < 0$,
and it turned out to be isomorphic to the product of a copy of Z and a cyclic group
of order 2 if $D_K > 0$. The hard step in this analysis was constructing an element
in the subgroup Z in the latter case.

Because of the importance of $R^\times$ in the quadratic case, we can expect that an
understanding of $R^\times$ for our general number field K is important for higher-degree
number-theoretic questions. In this section we shall obtain a structure theorem
for $R^\times$ for general n analogous to the structure theorem for n = 2 mentioned in
the previous paragraph. Such a theorem may not answer all important questions
about $R^\times$, but it will be a good start.¹² The main theorem is Theorem 5.13 below,
the Dirichlet Unit Theorem.

The units of R are the members $\varepsilon$ of R with $N_{K/\mathbb{Q}}(\varepsilon) = \pm 1$. This simple fact
is verified for general K in the same way that it was verified for quadratic K in
Section I.6.

Any element $\varepsilon$ of finite order in $R^\times$ is a complex number with $\varepsilon^k = 1$ for
some k and hence lies on the unit circle of C. Since such an element $\varepsilon$ is a root
of $X^k - 1$, all its conjugates $\sigma_j(\varepsilon)$ lie on the unit circle of C. We shall prove the
following proposition about these elements.


¹²For example, when n = 2, we defined the fundamental unit $\varepsilon_1$ for the case $D_K > 0$ to be the
least unit > 1, and the sign of $N_{K/\mathbb{Q}}(\varepsilon_1)$ was a thorny question that we did not answer fully but that
affected results in the problems at the end of the chapter.




Proposition 5.12. The subgroup of $R^\times$ of elements of finite order consists of
all $l^{\text{th}}$ roots of unity in C, where $l$ is an integer depending on K that is bounded
when the degree n = [K : Q] is bounded.


PROOF. We are to bound the integers k for which primitive $k^{\text{th}}$ roots of unity
occur in K. Let k have prime decomposition $k = p_1^{k_1} \cdots p_r^{k_r}$. From Section
IX.9 of Basic Algebra, we know that the cyclotomic polynomial $\Phi_k(X)$ is a
monic irreducible member of $\mathbb{Z}[X]$ whose roots in C are exactly all primitive $k^{\text{th}}$
roots of unity; moreover, the degree of $\Phi_k(X)$ is given by the Euler $\varphi$ function:

\[ \varphi(k) = k \prod_{p \text{ divides } k} \Big(1 - \frac1p\Big). \]

If primitive $k^{\text{th}}$ roots of unity occur in K, then $\varphi(k) \le n$ because $\Phi_k(X)$ is
irreducible over Q, and hence $(p_1 - 1)\cdots(p_r - 1) \le n$. Allowing $p_1 = 2$
possibly, we see that each factor $p_j - 1$ with $j > 1$ is at least 2, and thus
$2^{r-1} \le n$. So r is bounded as a function of n by $\log_2 2n$, and we obtain

\[ \varphi(k) \ge k \prod_{\substack{\text{first } \log_2 2n \\ \text{primes}}} \Big(1 - \frac1p\Big) \ge k\, 2^{-\log_2 2n} = \frac{k}{2n}. \]

Consequently $k \le 2n\varphi(k) \le 2n^2$, as required. If $R^\times$ contains one primitive $k^{\text{th}}$
root of unity in C, then it contains them all, since the $k^{\text{th}}$ roots of unity form a
cyclic group and any primitive such root is a generator. The result follows.
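
The bound $k \le 2n^2$ can be compared with the actual values of the Euler function. The Python sketch below is a numerical illustration, not part of the proof; the function names and the search limit of $20n^2$ are arbitrary choices for the example. For a few degrees n it lists every k up to the limit with $\varphi(k) \le n$ and checks that none exceeds $2n^2$.

```python
def euler_phi(k):
    """Euler phi function computed by trial division."""
    result, n, p = k, k, 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        result -= result // n
    return result


def orders_with_small_phi(n, limit):
    """All k <= limit with phi(k) <= n."""
    return [k for k in range(1, limit + 1) if euler_phi(k) <= n]


for n in (2, 3, 4, 6):
    ks = orders_with_small_phi(n, 20 * n * n)
    print(n, ks, max(ks) <= 2 * n * n)
```

For n = 2 the list is 1, 2, 3, 4, 6, which matches the roots of unity that occur in quadratic fields; the bound $2n^2$ is far from sharp but suffices for the proposition.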


We shall use the field mappings $\sigma_j : K \to \mathbb{C}$ for $1 \le j \le n$ to introduce useful
“absolute values” on K. The mappings $\sigma_j$ are of two types:
(i) those carrying K into R,
(ii) those carrying K into C but not into R; these come in pairs $\sigma$ and $\bar\sigma$,
where $\bar\sigma$ denotes the composition of $\sigma$ followed by complex conjugation.

Suppose that there are $r_1$ mappings $\sigma_j$ of the first kind and that there are $r_2$ pairs
of the second kind. Then $r_1 + 2r_2 = n$. Renumbering $\sigma_1, \dots, \sigma_n$ if necessary,
let us arrange that $\sigma_1, \dots, \sigma_{r_1}$ are of the first kind, that $\sigma_{r_1+1}, \dots, \sigma_n$ are of the
second kind, and that $\sigma_{r_1+r_2+i} = \bar\sigma_{r_1+i}$ for $1 \le i \le r_2$. We introduce $r_1 + r_2$
absolute values¹³ on K by the definition

\[ \|x\|_j = |\sigma_j(x)| \qquad \text{for } 1 \le j \le r_1 + r_2, \]

where $|\cdot|$ denotes the usual absolute value function on C. Then the function
$\operatorname{Log} : K^\times \to \mathbb{R}^{r_1+r_2}$ given by

\[ \operatorname{Log}(\varepsilon) = (\log\|\varepsilon\|_1, \dots, \log\|\varepsilon\|_{r_1+r_2}) \]

is evidently a group homomorphism.

¹³These are called archimedean absolute values of K in the general theory. Some authors refer
to them as archimedean valuations.

A lattice in a Euclidean space $\mathbb{R}^l$ is an additive subgroup $\mathbb{Z}u_1 \oplus \cdots \oplus \mathbb{Z}u_l$ such
that $\{u_1, \dots, u_l\}$ is linearly independent over R. Such a subgroup is discrete,¹⁴
and the quotient is compact, by the Heine-Borel Theorem.


Theorem 5.13 (Dirichlet Unit Theorem). Let K be a number field of degree n
with $r_1 + r_2$ absolute values, and let R be the ring of algebraic integers in K. The
kernel of the restriction to $R^\times$ of the function Log is the finite subgroup of roots
of unity in $K^\times$, and the image of this restriction of Log is a lattice in the vector
subspace of elements $(x_1, \dots, x_{r_1+r_2})$ in $\mathbb{R}^{r_1+r_2}$ satisfying

\[ x_1 + \cdots + x_{r_1} + 2x_{r_1+1} + \cdots + 2x_{r_1+r_2} = 0. \]

Consequently $R^\times$ is a finitely generated abelian group of rank $r_1 + r_2 - 1$.


EXAMPLES.


(1) The theorem reduces when n = 2 to results known from Chapter I.
Specifically if $K = \mathbb{Q}(\sqrt{m})$, then m > 0 makes $r_1 = 2$ and $r_2 = 0$, while
m < 0 makes $r_1 = 0$ and $r_2 = 1$.


(2) For $K = \mathbb{Q}(\sqrt[3]{2})$, let $\omega = e^{2\pi i/3}$. The field mappings of K into C carry K
into $\mathbb{R}$ or $\mathbb{R}\omega$ or $\mathbb{R}\omega^2$. Thus $r_1 = 1$ and $r_2 = 1$.


(3) The polynomial $F(X) = X^5 - 5X + 1$ in Q[X] was studied as an example in
connection with Galois theory in Section IX.11 of Basic Algebra. The polynomial
was shown to be irreducible over Q and to have three real roots and one pair of
complex conjugate roots. For $K = \mathbb{Q}[X]/(X^5 - 5X + 1)$, we therefore have
$r_1 = 3$ and $r_2 = 1$. The primitive element $\xi$ of K with $\xi^5 - 5\xi + 1 = 0$ lies in
R; it is a nontrivial example of a member of $R^\times$ because $\xi(\xi^4 - 5) = -1$.
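
The count of real roots in Example 3, and hence the rank predicted by Theorem 5.13, can be illustrated with a few evaluations. The Python sketch below is a hedged illustration rather than a proof: it counts sign changes of F at the integer points −2, ..., 2, and it relies on the separate observation that $F'(X) = 5(X^4 - 1)$ has only two real zeros, so that F has at most three real roots. The variable names are assumptions for the example.

```python
def F(x):
    return x ** 5 - 5 * x + 1


points = [-2, -1, 0, 1, 2]
values = [F(x) for x in points]

# Each sign change between consecutive points brackets a real root.
sign_changes = sum(1 for u, v in zip(values, values[1:]) if u * v < 0)

degree = 5
r1 = sign_changes          # exact here because F has at most 3 real roots
r2 = (degree - r1) // 2    # remaining roots come in conjugate pairs
unit_rank = r1 + r2 - 1    # rank of the unit group by Theorem 5.13
print(values, r1, r2, unit_rank)
```

Under these assumptions the unit group of this quintic field has rank 3.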


The proof of Theorem 5.13 will occupy the remainder of this section. We 
begin by clarifying in Lemma 5.14 the relationship between discrete subgroups 
and lattices in Euclidean space and by proving in Proposition 5.15 a weak version 
of Theorem 5.13 that addresses everything except the existence questions. 


Lemma 5.14. A discrete subgroup of $\mathbb{R}^l$ is a free abelian group of rank $\le l$
and is necessarily of the form $\mathbb{Z}u_1 \oplus \cdots \oplus \mathbb{Z}u_m$ for some set $\{u_1, \dots, u_m\}$ that is
linearly independent over R. The discrete subgroup is a lattice if and only if the
rank is $l$.


¹⁴A discrete subset of $\mathbb{R}^l$ is a subset S such that every one-point subset of S is open when S is
given the relative topology. See Lemma 5.14 below for a converse assertion.




PROOF. We begin by proving that any discrete subgroup of $\mathbb{R}^l$ is topologically
closed. Let G be the subgroup, and choose by discreteness an open ball
$V = \{x \in \mathbb{R}^l \mid |x| < \epsilon\}$ about 0 with $V \cap G = \{0\}$. The open ball $U =
\{x \in \mathbb{R}^l \mid |x| < \epsilon/2\}$ has the property that $U + U \subseteq V$. If G is not closed, let
$x_0$ be a limit point of G that is not in G. Then the open ball $x_0 - U$ about $x_0$
must contain a member g of G, and g cannot equal $x_0$. Write $x_0 - u = g$ with
$u \in U$. Then $u = x_0 - g$ is a limit point of G that is not in G, and we can find
$g' \neq 0$ in G such that $g'$ is in $u + U$. But $u + U \subseteq U + U \subseteq V$, and so $g'$ is in
$G \cap V = \{0\}$, contradiction. We conclude that G contains all its limit points and
is therefore closed.

From the fact that any discrete subgroup G of $\mathbb{R}^l$ is closed, let us see that any
bounded subset of G is finite. It is enough to see that the intersection X of G with
any (finite-radius) closed ball is finite. The set X is closed because G is closed,
and it is therefore compact by the Heine-Borel Theorem. By discreteness, find
for each $x \in X$ an open ball $U_x$ centered at x that contains no member of G other
than x. These open sets form an open cover of the compact set X, and a finite
subcollection of them covers X. Each such open set contains only one member
of X, and hence X is finite.

Returning to the statement of the lemma, we induct on the dimension of the
R linear span of the discrete subgroup, the base case being that the R linear span
is 0. Let G be the discrete subgroup, and let $\{v_1, \dots, v_m\}$ in G be a maximal set
that is linearly independent over R. Let $G_0 = G \cap \big(\sum_{j=1}^{m-1} \mathbb{R}v_j\big)$. By induction
we may assume that every $u \in G_0$ is a Z linear combination of $v_1, \dots, v_{m-1}$. Let
S be the set of R linear combinations of $\{v_1, \dots, v_m\}$ of the form

\[ S = \Big\{ v = c_1v_1 + \cdots + c_mv_m \in G \;\Big|\; 0 \le c_i < 1 \text{ for } 1 \le i \le m-1, \ 0 \le c_m \le 1 \Big\}. \]

The set S is bounded, and we saw in the previous paragraph that any bounded
subset of G is finite. So S is finite. Let $v'$ be a member of S with the smallest
positive coefficient for $v_m$, say

\[ v' = a_1v_1 + \cdots + a_mv_m. \]

If v is any member of S and its coefficient $c_m$ is not a multiple of $a_m$, then $v - jv'$
for a suitable integer j has $m^{\text{th}}$ coefficient positive but less than $a_m$; by subtracting
from $v - jv'$ a suitable Z linear combination $v''$ of $v_1, \dots, v_{m-1}$, we can make
$v - jv' - v''$ be in S, and then we have a contradiction to the minimality of
$a_m$. We conclude that $c_m$ is always a multiple of $a_m$. Then $v - jv'$ is in $G_0$ for
some integer j, and it follows that the Z linear combinations of $v_1, \dots, v_{m-1}, v'$
span G. This completes the induction and the proof of the first conclusion of the
lemma. The second conclusion is an immediate consequence of the first.




For the remainder of the section, we adopt the notation in the statement of 
Theorem 5.13, and we shall not repeat it in the statement of every intermediate 
result. 


Proposition 5.15 (weak form of Dirichlet Unit Theorem). The kernel of the
restriction to $R^\times$ of Log is the finite subgroup of roots of unity in $K^\times$, and the
image of this restriction of Log is a discrete additive subgroup in the vector
subspace of elements $(x_1, \dots, x_{r_1+r_2})$ in $\mathbb{R}^{r_1+r_2}$ satisfying

\[ x_1 + \cdots + x_{r_1} + 2x_{r_1+1} + \cdots + 2x_{r_1+r_2} = 0. \]

Consequently $R^\times$ is a finitely generated abelian group of rank $\le r_1 + r_2 - 1$.


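PROOF. For $\varepsilon$ in $R^\times$, we calculate that

\[ \begin{aligned}
\log\|\varepsilon\|_1 + \cdots &+ \log\|\varepsilon\|_{r_1} + 2\log\|\varepsilon\|_{r_1+1} + \cdots + 2\log\|\varepsilon\|_{r_1+r_2} \\
&= \log\big(|\sigma_1(\varepsilon)| \cdots |\sigma_{r_1}(\varepsilon)|\,|\sigma_{r_1+1}(\varepsilon)|^2 \cdots |\sigma_{r_1+r_2}(\varepsilon)|^2\big) \\
&= \log\Big|\prod_{j=1}^{n} \sigma_j(\varepsilon)\Big| \\
&= \log|N_{K/\mathbb{Q}}(\varepsilon)| = \log 1 = 0.
\end{aligned} \]

Hence the image lies in the vector subspace in the statement of the proposition.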
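
The vanishing of the weighted coordinate sum is easy to observe numerically in the real quadratic case, where $r_1 = 2$ and $r_2 = 0$. The Python sketch below is an illustration with assumed names (`log_vector` is not notation from the text). It evaluates Log on the unit $1 + \sqrt2$ of $\mathbb{Q}(\sqrt2)$, whose two conjugates are $1 \pm \sqrt2$.

```python
import math


def log_vector(a, b, m):
    """Log of a + b*sqrt(m) in a real quadratic field (r1 = 2, r2 = 0):
    the logarithms of the absolute values of the two real conjugates."""
    s = math.sqrt(m)
    return (math.log(abs(a + b * s)), math.log(abs(a - b * s)))


vec = log_vector(1, 1, 2)   # the unit 1 + sqrt(2), of norm -1
print(vec, sum(vec))
```

Up to rounding error the two coordinates are negatives of each other, as the computation in the proof predicts for any unit.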

Fix a (large) positive number M, and consider the set $E_M$ of all members $\varepsilon$
of $R^\times$ for which all coordinates of $\operatorname{Log}(\varepsilon)$ are $\le M$ in absolute value. Then the
field polynomials

\[ \det\big(XI - (\text{left by } \varepsilon)\big) = \prod_{j=1}^{n} (X - \sigma_j(\varepsilon)) \]

of such elements $\varepsilon$ have all coefficients bounded by some $M'$ depending on M,
since each $|\sigma_j(\varepsilon)|$ is of the form $\|\varepsilon\|_i$ and is $\le e^M$. Such a field polynomial is
equal to $g(X)^r$, where g(X) is the minimal polynomial of $\varepsilon$ and r is given by
$r \deg(g(X)) = n$. Since $\varepsilon$ is in R, the coefficients of g(X) are integers, and
hence so are the coefficients of the corresponding field polynomial. There are
only finitely many members of $\mathbb{Z}[X]$ of degree n whose coefficients are in a given
bounded set, and hence there are only finitely many $\varepsilon$’s in $E_M$.

It follows that the image subgroup is discrete. Taking M = 0, we see also that
the kernel of the restriction of Log to $R^\times$ is finite. Hence every element of this
kernel has finite order and is therefore a root of unity.




We come to the proof of Theorem 5.13. For quadratic extensions of Q, which 
were handled in Section I.6, the crucial question of existence was addressed by 
means of an approximation result (Lemma 1.15) for irrational numbers. That 
result did not immediately establish the existence of units of infinite order, but it 
was applied infinitely many times in the course of proving Proposition 1.16, and 
the total effect was to produce a unit of infinite order. 

We do something similar in general. In place of the approximation result 
in Lemma 1.15, we shall use a result known as the Minkowski Lattice-Point 
Theorem, which asserts the existence of lattice points in certain compact convex 
sets in Euclidean space. This result appears as Theorem 5.16 below. As was true 
in the quadratic case, it is not just a single application of this theorem that produces 
the desired units, but an infinite sequence of applications of it. The details will 
be more complicated here than in the quadratic case. Before describing how the 
argument is to proceed, let us establish the Minkowski theorem. 

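Let $\{v_1, \dots, v_m\}$ be an $\mathbb{R}$ basis of $\mathbb{R}^m$, and let $L = \mathbb{Z}v_1 \oplus \cdots \oplus \mathbb{Z}v_m$ be the
corresponding lattice. The fundamental parallelotope for $L$ corresponding to
this basis is the set


\[ \{c_1 v_1 + \cdots + c_m v_m \mid 0 \le c_j \le 1 \text{ for } 1 \le j \le m\}. \]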


The volume of this fundamental parallelotope is independent of the choice of the
$\mathbb{Z}$ basis for $L$. In fact, any two such $\mathbb{Z}$ bases are carried from one to the other by an
integer matrix of determinant $\pm 1$, and any linear transformation from $\mathbb{R}^m$ to itself
of determinant $\pm 1$ is volume preserving. The one fundamental parallelotope is
mapped to the other when the one basis is carried to the other, and hence the two
fundamental parallelotopes have the same volume.
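The invariance of the volume can be seen concretely in the plane. The sketch below is only an illustration under assumed data: the basis `(2, 1), (1, 3)` and the integer matrix `unimodular` of determinant 1 are hypothetical choices, and the helpers `det2` and `change_basis` are not taken from the text.

```python
def det2(rows):
    """Determinant of a 2-by-2 matrix given as two rows."""
    (a, b), (c, d) = rows
    return a * d - b * c

def change_basis(unimodular, basis):
    """Return the rows of the product unimodular * basis (2-by-2 matrices)."""
    return [
        tuple(sum(unimodular[i][k] * basis[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    ]

basis = [(2, 1), (1, 3)]        # rows are the Z basis vectors v1, v2
unimodular = [(2, 1), (1, 1)]   # integer matrix of determinant +1
new_basis = change_basis(unimodular, basis)

# The two fundamental parallelotopes have the same area |det|.
print(abs(det2(basis)), abs(det2(new_basis)))
```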


Theorem 5.16 (Minkowski Lattice-Point Theorem).¹⁵ Let $L$ be a lattice in
$\mathbb{R}^m$, and let $V_0$ be the volume of a fundamental parallelotope. If $E$ is any compact
convex set in $\mathbb{R}^m$ containing 0, closed under negatives, and having $\mathrm{volume}(E) \ge 2^m V_0$,
then $E$ contains a nonzero point of $L$.


REMARK. The constant $2^m$ in the statement is best possible, as is shown by
taking $L$ to be the standard lattice and $E$ to be a cube oriented consistently with
$L$, centered at 0, and having each side slightly less than 2. We need merely some
constant, not the best possible one, in the application to Theorem 5.13, and the
proof can be simplified a little for that purpose.¹⁶ But the present theorem will be
applied again in the next section, and this time the best possible constant yields
the most useful information.


¹⁵The simple proof given here is due to H. Blichfeldt and is the standard one, so standard that
Blichfeldt's name is sometimes attached to the theorem.

¹⁶In particular, the final paragraph of the proof can be omitted, and we can fix a value of $M$
proportional to $s$ in making the argument.




PROOF. Without loss of generality, $L$ is the standard lattice of points with all
coordinates in $\mathbb{Z}$, and $V_0$ is 1. Fix an arbitrarily small positive constant $\epsilon$, and
first assume that the given set $E$ has $\mathrm{volume}(E) \ge (2 + \epsilon)^m V_0$. Arguing by
contradiction, suppose that the only lattice point in $E$ is 0. Since $E$ is bounded,
we can choose a number $s > 0$ in such a way that $E$ is contained in the cube
$C_s$ centered at 0, oriented consistently with the lattice, and having side $2s$. Let
us see that the sets $l + \tfrac12 E$ for $l \in L$ are disjoint. In fact, in obvious notation if
$l_1 + \tfrac12 e_1 = l_2 + \tfrac12 e_2$ with $l_1 \ne l_2$, then $l_1 - l_2 = \tfrac12 (e_2 - e_1)$, and this is in $E$
because $e_2$ and $-e_1$ are in $E$ and $E$ is convex. This would make $l_1 - l_2$ a nonzero
lattice point in $E$, contrary to our supposition. Thus the sets $l + \tfrac12 E$ are indeed
disjoint.

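Choose an integer $M$ large enough to have $s/M < \epsilon$. Any lattice point $l$ whose
coordinates are all $\le M$ in absolute value has $l + \tfrac12 E \subseteq C_{M + \frac12 s}$. Since the sets
$l + \tfrac12 E$ for these $l$'s are disjoint,


\[
\bigl(2(M + \tfrac12 s)\bigr)^m = \mathrm{volume}(C_{M + \frac12 s})
\ge \sum_{\substack{l \in L \text{ with} \\ \text{all coordinates} \le M}} \mathrm{volume}(l + \tfrac12 E)
\ge (2M)^m\,\mathrm{volume}(\tfrac12 E) = M^m\,\mathrm{volume}(E),
\]


and therefore $\mathrm{volume}(E) \le (2 + s/M)^m$, in contradiction to our extra assumption
that $\mathrm{volume}(E) \ge (2 + \epsilon)^m$.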

Now suppose that $\mathrm{volume}(E) = 2^m$. For each $\epsilon > 0$, let $E_\epsilon$ be the dilate
$(1 + \tfrac12 \epsilon)E$. The sets $E_\epsilon$ satisfy the extra assumption made in the previous part of
the proof, and therefore $E_\epsilon$ contains a nonzero lattice point. Since each $E_\epsilon$ with
$\epsilon \le 1$ lies in the bounded set $E_1$, there are only finitely many possibilities for this
nonzero lattice point when $\epsilon \le 1$. Thus we can find a sequence of $\epsilon$'s tending to 0
for which this lattice point is the same. The convexity of the sets $E_\epsilon$, in combination
with the fact that the sets contain 0, implies that the sets are nested, and therefore
this lattice point lies in $E_\epsilon$ for all $\epsilon > 0$. Since $E$ is compact, $E = \bigcap_{\epsilon > 0} E_\epsilon$, and
therefore this lattice point lies in $E$.
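The conclusion of Theorem 5.16 can be observed by direct search in a small case. The sketch below is a brute-force illustration, not the argument of the proof: the lattice basis `(2, 1), (1, 3)` (so $V_0 = 5$), the thin box $E$ of area exactly $2^2 V_0 = 20$, and the function name `nonzero_lattice_points` are all assumptions chosen for the example.

```python
from itertools import product

def nonzero_lattice_points(basis, contains, search_radius):
    """Yield the nonzero points c1*v1 + ... + cm*vm with |ci| <= search_radius
    that satisfy the membership test `contains`."""
    m = len(basis)
    for coeffs in product(range(-search_radius, search_radius + 1), repeat=m):
        if not any(coeffs):
            continue  # skip the origin
        point = tuple(sum(c * v[k] for c, v in zip(coeffs, basis)) for k in range(m))
        if contains(point):
            yield point

basis = [(2, 1), (1, 3)]   # fundamental parallelotope of area V0 = 5

def in_box(p):
    """Compact convex symmetric box |x| <= 10, |y| <= 1/2, of area 20 = 4 * V0."""
    return abs(p[0]) <= 10 and abs(p[1]) <= 0.5

hits = sorted(nonzero_lattice_points(basis, in_box, 5))
print(hits)
```

The box is far from round, yet its area alone forces it to contain a nonzero lattice point.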


Let us describe the lattice to be used when the Minkowski Lattice-Point Theorem
is applied to obtain the Dirichlet Unit Theorem. Let $\Omega$ be the real vector
space $\Omega = \mathbb{R}^{r_1} \times \mathbb{C}^{r_2} \cong \mathbb{R}^n$, and let $|\omega|_s$ be the magnitude of the $s$th component
of $\omega \in \Omega$ for $1 \le s \le r_1 + r_2$. We introduce a homomorphism $\Phi$ of the additive
group of $K$ into the additive group of $\Omega$ given by


\[ \Phi(x) = (\sigma_1(x), \dots, \sigma_{r_1}(x), \sigma_{r_1+1}(x), \dots, \sigma_{r_1+r_2}(x)) \]


for $x \in K$. We shall be mostly interested in the restriction of $\Phi$ to $R$, but the
values on $K$ will help a little with motivation when the Minkowski Lattice-Point
Theorem is applied once again in the next section. Observe that our definitions
make $\|x\|_s = |\sigma_s(x)| = |\Phi(x)|_s$ for $x \in K$ and $1 \le s \le r_1 + r_2$.




Lemma 5.17. The image $\Phi(R)$ is a lattice in $\Omega$.


PROOF. The homomorphism $\Phi$ is one-one on $R$ because $\sigma_1$, being a field map,
is one-one. Since $R$ is a free abelian group of rank $n$ and $\Phi$ is one-one, $\Phi(R)$ is
free abelian of rank $n$. Lemma 5.14 therefore shows that it is sufficient to show
that $\Phi(R)$ is discrete as an additive subgroup of $\Omega$. It is enough to show that a
bounded region of $\Omega$ contains only finitely many points of $\Phi(R)$.

The verification of this fact is similar to an argument in the proof of Proposition
5.15: A bound by some $M$ on all $|\sigma_j(\alpha)|$ for certain elements $\alpha \in R$ implies that
each field polynomial


\[ \det\bigl(XI - (\text{left by } \alpha)\bigr) = \prod_{j=1}^{n} (X - \sigma_j(\alpha)) \]


has all its coefficients bounded by some $M'$ depending on $M$. These coefficients
are integers when $\alpha$ is in $R$, and thus there are only finitely many such polynomials.
Each polynomial has at most $n$ distinct roots, and consequently only finitely many
$\alpha$'s satisfy such a bound.


We are now ready to prove Theorem 5.13, but we precede the proof by an 
outline. The proof has three steps to it: 


(1) We apply the Minkowski Lattice-Point Theorem to the set $\Phi(R) \subseteq \Omega$,
which we know is a lattice because of Lemma 5.17. For each $s_0$ with $1 \le s_0 \le r_1 + r_2$,
let $E_{s_0}$ be a set of $\omega$'s in $\Omega$ defined by the conditions that $|\omega|_s$ is to be
small for $s \ne s_0$ and $|\omega|_{s_0}$ is allowed to be large, with the understanding that
$E_{s_0}$ is a bounded set and that $E_{s_0}$ has volume $\ge 2^n V_0$, where $V_0$ is the volume
of a fundamental parallelotope of $\Phi(R)$. Using a nonzero lattice point in $\Phi(R)$
obtained from applying Theorem 5.16 to $E_{s_0}$ and squeezing $E_{s_0}$ even more, we
can obtain an infinite sequence of points $\alpha$ in $R$ such that $|N_{K/\mathbb{Q}}(\alpha)|$ remains
bounded and such that the size of this norm is contributed to mostly by $\|\alpha\|_{s_0}$.

(2) Applying the same argument that was used for quadratic extensions of $\mathbb{Q}$ in
the proof of Proposition 1.16, we obtain infinite sequences of units whose norm
is contributed to mostly by $\|\cdot\|_{s_0}$. We can do this for $1 \le s_0 \le r_1 + r_2$.

(3) We pass to the Log map, proving and applying the following result from linear
algebra: a real square matrix $[a_{ij}]$ with the property that $|a_{ii}| > \sum_{j \ne i} |a_{ij}|$ for
all $i$ is nonsingular. In the application of this result, we have $\log\|\varepsilon_{s_0}\|_{s_0} > 0$ for the
$s_0$th constructed unit, $\log\|\varepsilon_{s_0}\|_s < 0$ for $s \ne s_0$, and an equality that we can write
either as $\sum_{j=1}^{n} \log|\sigma_j(\varepsilon_{s_0})| = 0$ or as
$\sum_{s=1}^{r_1} \log\|\varepsilon_{s_0}\|_s + 2\sum_{s=r_1+1}^{r_1+r_2} \log\|\varepsilon_{s_0}\|_s = 0$.
If we drop all terms corresponding to the $(r_1 + r_2)$th unit, then we are in a situation
for which the result from linear algebra immediately implies the theorem.




PROOF OF THEOREM 5.13. The proof is carried out in three steps. 
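Step 1. For fixed $s_0$ with $1 \le s_0 \le r_1 + r_2$, we construct an infinite sequence
$\alpha_j^{(s_0)}$ in $R$ with
(i) $|N_{K/\mathbb{Q}}(\alpha_j^{(s_0)})| \le 2^n V_0$,
(ii) $\|\alpha_j^{(s_0)}\|_s$ tends to 0 for each $s \ne s_0$ as $j$ tends to infinity,
(iii) $\|\alpha_j^{(s_0)}\|_{s_0}$ tends to infinity as $j$ tends to infinity.
For the construction, form for each $j > 0$ the compact convex set in $\Omega$ closed
under multiplication by $-1$ consisting of all $\omega$ such that


\[
|\omega|_s \le j^{-1} \quad \text{for } s \ne s_0,
\qquad
|\omega|_{s_0} \le
\begin{cases}
j^{n-1}\, 2^{n-r_1} \pi^{-r_2} V_0 & \text{if } 1 \le s_0 \le r_1, \\
(j^{n-2}\, 2^{n-r_1} \pi^{-r_2} V_0)^{1/2} & \text{if } r_1 + 1 \le s_0 \le r_1 + r_2.
\end{cases}
\]


This set has volume

\[
\begin{cases}
(2j^{-1})^{r_1-1} (\pi j^{-2})^{r_2}\, (2\, j^{n-1} 2^{n-r_1} \pi^{-r_2} V_0) = 2^n V_0 & \text{if } s_0 \le r_1, \\
(2j^{-1})^{r_1} (\pi j^{-2})^{r_2-1}\, (\pi\, j^{n-2} 2^{n-r_1} \pi^{-r_2} V_0) = 2^n V_0 & \text{if } s_0 > r_1.
\end{cases}
\]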

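Theorem 5.16 shows that the set contains a nonzero lattice point $\alpha_j^{(s_0)}$. Let us
check that this point satisfies (i), (ii), and (iii). For (i), we have

\[
\begin{aligned}
|N_{K/\mathbb{Q}}(\alpha_j^{(s_0)})|
&= \Bigl(\prod_{s=1}^{r_1} \|\alpha_j^{(s_0)}\|_s\Bigr) \Bigl(\prod_{s=r_1+1}^{r_1+r_2} \|\alpha_j^{(s_0)}\|_s^2\Bigr) \\
&\le
\begin{cases}
(j^{-1})^{r_1-1} (j^{-2})^{r_2}\, (j^{n-1} 2^{n-r_1} \pi^{-r_2} V_0) & \text{if } s_0 \le r_1, \\
(j^{-1})^{r_1} (j^{-2})^{r_2-1}\, (j^{n-2} 2^{n-r_1} \pi^{-r_2} V_0) & \text{if } s_0 > r_1
\end{cases} \\
&= 2^n V_0\, 2^{-r_1} \pi^{-r_2} \\
&\le 2^n V_0.
\end{aligned}
\]


Property (ii) is immediate from the inequality $\|\alpha_j^{(s_0)}\|_s \le j^{-1}$ for $s \ne s_0$. For
(iii), we have


\[
1 \le |N_{K/\mathbb{Q}}(\alpha_j^{(s_0)})|
= \Bigl(\prod_{s=1}^{r_1} \|\alpha_j^{(s_0)}\|_s\Bigr) \Bigl(\prod_{s=r_1+1}^{r_1+r_2} \|\alpha_j^{(s_0)}\|_s^2\Bigr);
\]

thus (ii) implies (iii).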
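Step 2. For fixed $s_0$ with $1 \le s_0 \le r_1 + r_2$, we construct an infinite sequence
of units $\varepsilon_j^{(s_0)}$ such that
(ii′) $\|\varepsilon_j^{(s_0)}\|_s$ tends to 0 for each $s \ne s_0$ as $j$ tends to infinity,
(iii′) $\|\varepsilon_j^{(s_0)}\|_{s_0}$ tends to infinity as $j$ tends to infinity.

For the construction, we pass to a subsequence from Step 1, still denoting it by
$\alpha_j^{(s_0)}$, such that $N_{K/\mathbb{Q}}(\alpha_j^{(s_0)})$ is a constant integer, say $M$. Since $R/(M)$ is finite,
we can pass to a further subsequence, still with no change in notation, such that
all $\alpha_j^{(s_0)}$ lie in the same residue class¹⁷ modulo the principal ideal $(M)$ of $R$. Put

\[ \varepsilon_j^{(s_0)} = \alpha_j^{(s_0)} / \alpha_1^{(s_0)}. \]

Then $N_{K/\mathbb{Q}}(\alpha_j^{(s_0)}) = N_{K/\mathbb{Q}}(\alpha_1^{(s_0)})$, since $N_{K/\mathbb{Q}}(\alpha_j^{(s_0)})$ is a constant integer, and
$M^{-1}(\alpha_j^{(s_0)} - \alpha_1^{(s_0)})$ is in $R$, since all $\alpha_j^{(s_0)}$ lie in the same residue class modulo $(M)$.
The computation


\[
\varepsilon_j^{(s_0)} = 1 + \frac{\alpha_j^{(s_0)} - \alpha_1^{(s_0)}}{\alpha_1^{(s_0)}}
= 1 + \frac{\alpha_j^{(s_0)} - \alpha_1^{(s_0)}}{M} \prod_{\sigma \ne 1} \sigma(\alpha_1^{(s_0)})
\]


shows that $\varepsilon_j^{(s_0)}$ is an algebraic integer. Hence it is in $R$. We certainly have


\[
N_{K/\mathbb{Q}}(\varepsilon_j^{(s_0)}) = \frac{N_{K/\mathbb{Q}}(\alpha_j^{(s_0)})}{N_{K/\mathbb{Q}}(\alpha_1^{(s_0)})} = \frac{M}{M} = 1.
\]


Therefore $\varepsilon_j^{(s_0)}$ is a unit. Also, the computation

\[ \|\varepsilon_j^{(s_0)}\|_s = \frac{\|\alpha_j^{(s_0)}\|_s}{\|\alpha_1^{(s_0)}\|_s} \]


shows that (ii) and (iii) in Step 1 imply (ii′) and (iii′) here.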
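The device in Step 2, namely that two elements with the same norm $M$ lying in the same residue class modulo $(M)$ have a unit as quotient, can be tried out in $\mathbb{Z}[\sqrt2]$. The following sketch is an illustration under assumptions: the norm value $M = 7$, the search bound, and the helper names `norm` and `find_unit` are hypothetical, and the elements are found by exhaustive search rather than by the lattice-point construction of Step 1.

```python
def norm(a, b):
    """Field norm of a + b*sqrt(2) from Q(sqrt(2)) down to Q."""
    return a * a - 2 * b * b

def find_unit(M, bound):
    """Search for two elements a + b*sqrt(2) of norm M, with 0 < a, b < bound,
    that agree modulo (M); return their quotient as a pair (p, q) = p + q*sqrt(2)."""
    seen = {}  # residue class (a mod M, b mod M) -> first element found in it
    for a in range(1, bound):
        for b in range(1, bound):
            if norm(a, b) != M:
                continue
            key = (a % M, b % M)
            if key in seen:
                c, d = seen[key]
                # (a + b*sqrt2) / (c + d*sqrt2) = (a + b*sqrt2)(c - d*sqrt2) / M
                p, q = a * c - 2 * b * d, b * c - a * d
                return (p // M, q // M)
            seen[key] = (a, b)
    return None

unit = find_unit(7, 500)
print(unit, norm(*unit))
```

In this run the two elements found are $3 + \sqrt2$ and $437 + 309\sqrt2$, and their quotient $99 + 70\sqrt2$ has norm 1.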
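Step 3. For each $s_0$ with $1 \le s_0 \le r_1 + r_2$, choose $j$ large enough for the unit
$\varepsilon^{(s_0)} = \varepsilon_j^{(s_0)}$ in Step 2 to satisfy
(ii″) $\|\varepsilon^{(s_0)}\|_s < 1$ if $s \ne s_0$,
(iii″) $\|\varepsilon^{(s_0)}\|_{s_0} > 1$.


We assert that the vectors $\mathrm{Log}(\varepsilon^{(s_0)})$ for $1 \le s_0 \le r_1 + r_2 - 1$ are linearly
independent over $\mathbb{R}$. Hence $\mathrm{Log}(R^\times)$ has rank $\ge r_1 + r_2 - 1$, and Proposition
5.15 therefore implies that $\mathrm{Log}(R^\times)$ has rank equal to $r_1 + r_2 - 1$.

To verify this assertion, form the square matrix $[a_{ij}]$ of size $r_1 + r_2$ given by


\[
a_{ij} =
\begin{cases}
\log\|\varepsilon^{(i)}\|_j & \text{if } 1 \le j \le r_1, \\
2\log\|\varepsilon^{(i)}\|_j & \text{if } r_1 + 1 \le j \le r_1 + r_2.
\end{cases}
\]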

¹⁷This conclusion uses a result known as the Dirichlet pigeonhole principle or the Dirichlet
box principle.




Then $a_{ii} > 0$ for each $i$ by (iii″), $a_{ij} < 0$ for $i \ne j$ by (ii″), and $\sum_j a_{ij} = 0$ for
each $i$ because $N_{K/\mathbb{Q}}(\varepsilon^{(i)}) = 1$. Let $[b_{ij}]$ be the upper left block of $[a_{ij}]$ of size
$r_1 + r_2 - 1$. For each $i$, we then have $b_{ii} > 0$ and $\sum_{j \ne i} |b_{ij}| < b_{ii}$. Let
us prove that the matrix $[b_{ij}]$ is nonsingular. Assuming the contrary, let $[c_j]$ be a
nonzero column vector with


\[ \sum_j b_{ij} c_j = 0 \quad \text{for all } i. \tag{$*$} \]


If $i_0$ is an index such that $|c_{i_0}| \ge |c_j|$ for all $j$, then setting $i = i_0$ leads to the
strict inequality


\[
|c_{i_0} b_{i_0 i_0}| = |c_{i_0}|\, b_{i_0 i_0} > |c_{i_0}| \sum_{j \ne i_0} |b_{i_0 j}|
\ge \sum_{j \ne i_0} |b_{i_0 j} c_j| \ge \Bigl|\sum_{j \ne i_0} b_{i_0 j} c_j\Bigr|,
\]


which contradicts $(*)$. Thus $[b_{ij}]$ is nonsingular.
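The linear-algebra fact just proved is easy to test on examples. The sketch below checks strict row diagonal dominance and computes the determinant exactly with rational arithmetic. The sample matrix is an assumption chosen to mimic the sign pattern of $[b_{ij}]$ (positive diagonal, negative off-diagonal entries); it is not derived from any particular number field, and the function names are hypothetical.

```python
from fractions import Fraction

def is_strictly_diagonally_dominant(matrix):
    """True if |a_ii| > sum over j != i of |a_ij| for every row i."""
    return all(
        abs(row[i]) > sum(abs(x) for j, x in enumerate(row) if j != i)
        for i, row in enumerate(matrix)
    )

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # a zero column: singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                   # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

B = [[3, -1, -1], [-1, 4, -2], [-1, -1, 5]]
print(is_strictly_diagonally_dominant(B), determinant(B))
```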

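We conclude that $[b_{ij}]$ has rank $r_1 + r_2 - 1$. Thus its rows are linearly
independent, and the first $r_1 + r_2 - 1$ rows of $[a_{ij}]$ must be linearly independent.
Therefore the vectors


\[
\bigl(\log\|\varepsilon^{(s_0)}\|_1, \dots, \log\|\varepsilon^{(s_0)}\|_{r_1},\, 2\log\|\varepsilon^{(s_0)}\|_{r_1+1}, \dots, 2\log\|\varepsilon^{(s_0)}\|_{r_1+r_2}\bigr),
\]


indexed by $s_0$ for $1 \le s_0 \le r_1 + r_2 - 1$, are linearly independent in $\mathbb{R}^{r_1+r_2}$. In other
words, the vectors $\mathrm{Log}(\varepsilon^{(s_0)})$ are linearly independent for $1 \le s_0 \le r_1 + r_2 - 1$.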


6. Finiteness of the Class Number 


As in Section 5, let $K$ be a number field of degree $n$ over $\mathbb{Q}$, and let $R$ be its ring
of algebraic integers. Let $\sigma_1, \dots, \sigma_n$ be the distinct field maps of $K$ into $\mathbb{C}$, and
assume that the first $r_1$ of them have image in $\mathbb{R}$ and the remaining ones come in
conjugate pairs with $\sigma_{r_1+r_2+k} = \overline{\sigma_{r_1+k}}$ for $1 \le k \le r_2$.

As in Section I.7, where we treated the case of quadratic extensions, we define
two nonzero ideals $I$ and $J$ of $R$ to be equivalent if $(r)I = (s)J$ for suitable
nonzero elements $r$ and $s$ of $R$. The same argument as given in that section
shows that the result is an equivalence relation. The principal ideals form a single
equivalence class.¹⁸


¹⁸Section I.7 worked also with a notion of strict equivalence of ideals, but we shall not attempt
to extend strict equivalence to the present setting.




Proposition 5.18. Multiplication of nonzero ideals in $R$ descends to a
well-defined multiplication of equivalence classes of ideals, and the resulting
multiplication makes the set of equivalence classes into an abelian group. The identity
element of this group is the class of principal ideals.


REMARKS. The proofs of this result and of Theorem 5.19 below will use the
following fact proved in Problems 48-53 of Chapter VIII of Basic Algebra: if $I$
is any nonzero ideal in $R$ and if $I^{-1}$ is defined by $I^{-1} = \{x \in K \mid xI \subseteq R\}$, then
$I^{-1}I = R$ and there exists $r \in R$ with $rI^{-1}$ equal to an ideal of $R$. This fact can
be made to look more beautiful by introducing the notion of “fractional ideal,”
but we shall not carry out that step at this time.¹⁹


PROOF. If $I$ is a nonzero ideal, let $[I]$ denote its equivalence class, and define
$[I][J] = [IJ]$. Suppose that $(r)I = (s)I'$ exhibits an equivalence. Then the
equality $(s)I'J = (r)IJ$ shows that $[I'J] = [IJ]$. A similar argument applies
in the $J$ variable, and therefore multiplication of classes is well defined. It is
immediate that multiplication of classes is associative and commutative and also
that the class of principal ideals is an identity. If a class $[I]$ is given, let $I^{-1}$ be
as in the remarks above, and choose a nonzero $r \in R$ such that $rI^{-1} = J$ is an
ideal in $R$. Multiplying by $I$ gives $(r) = r(I^{-1}I) = (rI^{-1})I = JI$, and thus
$[J][I]$ is the class of the principal ideals. So $[I]$ has an inverse.


The group of equivalence classes of nonzero ideals as in Proposition 5.18 is
called the ideal class group of $K$. Its order is called the class number of $K$ and
will be denoted by $h_K$. The main theorem of this section is as follows.


Theorem 5.19. The class number $h_K$ of any number field is finite.


As we shall see in a moment, it is not too difficult at this stage to prove this
finiteness. However, $h_K$ is an important invariant of a number field that determines
whether $R$ is a principal ideal domain, that occurs in various limit formulas in
the subject, and that occurs also in dimension formulas connected with “Hilbert
class fields.” It is therefore of considerable interest to be able to compute $h_K$ in
specific examples. For quadratic fields this computation can be carried out by
the techniques of Chapter I because of the close connection between ideal classes
and proper equivalence classes of binary quadratic forms. But no comparable
theory is available as an aid in computation for number fields of degree greater
than 2. As we shall see, the relatively easy proof of Theorem 5.19 that we give
in a moment does not offer any helpful clues about the value of $h_K$. The main
task of this section will therefore be to provide a better proof of Theorem 5.19
that helps us find the value of $h_K$ in specific examples.


¹⁹The result of the beautification is that the fractional ideals form a group generated by the ideals,
and the group of equivalence classes is a homomorphic image of the group of fractional ideals.

The two proofs have the following lemma in common. The lemma eliminates 
the notion of equivalence of ideals from the investigation and shows that the 
problem is really that of finding elements in each ideal of relatively small norm. 


Lemma 5.20. For a particular number field $K$, if there exists a real constant $C$
with the property that each nonzero ideal $I$ of $R$ contains an element $s \ne 0$ with


\[ |N_{K/\mathbb{Q}}(s)| \le C\,N(I), \]


then each equivalence class of ideals contains a member $L$ whose absolute norm
satisfies $N(L) \le C$. Consequently the class number $h_K$ is at most the number of
nonzero ideals $I$ in $R$ with $N(I) \le C$. This is a finite number.


PROOF. Let a nonzero ideal $I$ in $R$ be given. By the remarks with Proposition
5.18, choose a nonzero element $r$ in $R$ and an ideal $J$ such that $rI^{-1} = J$.
Multiplication by $I$ and use of the remarks shows that $(r) = JI$. By hypothesis
for the lemma, choose a nonzero $s \in J$ with $|N_{K/\mathbb{Q}}(s)| \le C\,N(J)$. Since $s$ is in
$J$, $(s)$ is contained in $J$, and therefore $(s) = JL$ for some ideal $L$. Multiplying
both sides of $(r) = JI$ by $L$ gives $(r)L = LJI = (s)I$, and $L$ is therefore
equivalent to $I$. Applying Proposition 5.4, we obtain $N(J)N(L) = N(JL) =
N((s)) = |N_{K/\mathbb{Q}}(s)| \le C\,N(J)$. Therefore $N(L) \le C$ as required.

Let us now count the ideals $I$ with $N(I) \le C$. In terms of the unique
factorization $I = \prod_{j=1}^{k} P_j^{e_j}$ of $I$, we have $N(I) \ge \prod_{j=1}^{k} p_j^{e_j}$, where $p_j$ is the
prime number such that $P_j \cap \mathbb{Z} = (p_j)$. In each case, $N(P_j) \ge p_j$. There are
only finitely many primes $p$ with $p \le C$, each is associated with only finitely
many prime ideals $P$ of $R$ with $P \cap \mathbb{Z} = (p)$, and $P^e$ contributes at least $2^e$
toward $N(I)$. The inequality $N(I) \le C$ shows that these $p$'s and their associated
$P$'s are the only possible contributors to $I$ and that each exponent is bounded by
$\log_2 N(I) \le \log_2 C$. Hence there are only finitely many possibilities for $I$.


Here is the relatively easy proof of Theorem 5.19. 


FIRST PROOF OF THEOREM 5.19. Let $x_1, \dots, x_n$ be a $\mathbb{Z}$ basis of $R$, and express
members of $R$ in terms of this basis as $r = \sum_{i=1}^{n} c_i x_i$ with all $c_i \in \mathbb{Z}$. The
value of $N_{K/\mathbb{Q}}(r)$ is the value of the determinant of left multiplication by $r$ on
$K$, and this value, as a function of $c_1, \dots, c_n$, is a homogeneous polynomial of
degree $n$. Consequently we can find a constant $C$ such that
$\bigl|N_{K/\mathbb{Q}}\bigl(\sum_{i=1}^{n} c_i x_i\bigr)\bigr| \le C \max_{1 \le i \le n} |c_i|^n$.

It is enough to show that the condition of Lemma 5.20 is satisfied for this $C$.
Thus let an ideal $I$ be given. As each $c_i$ runs through the integers from 0 to
$N(I)^{1/n}$, we obtain more than $N(I)$ members $r = \sum_{i=1}^{n} c_i x_i$ of $R$. Since there
are only $N(I)$ cosets modulo $I$, at least two of these members of $R$, say $r_1$ and
$r_2$, must lie in the same coset.²⁰ Then $r_1 - r_2$ is a nonzero member of $I$, it has all
coefficients between $-N(I)^{1/n}$ and $+N(I)^{1/n}$, and our construction of $C$ forces
$|N_{K/\mathbb{Q}}(r_1 - r_2)| \le C\,(N(I)^{1/n})^n = C\,N(I)$.


The second proof of Theorem 5.19 is to combine Lemma 5.20 with the deeper 
and more quantitative estimate given in the following theorem. 


Theorem 5.21 (Minkowski). For any number field $K$ of degree $n$, each nonzero
ideal $I$ of $R$ contains an element $s \ne 0$ with


\[ |N_{K/\mathbb{Q}}(s)| \le \Bigl(\frac{4}{\pi}\Bigr)^{r_2} \frac{n!}{n^n}\, |D_K|^{1/2}\, N(I). \]


Here $r_2$ is half the number of nonreal embeddings of $K$ in $\mathbb{C}$, and $D_K$ is the field
discriminant. Therefore every equivalence class of ideals contains a member $L$
whose absolute norm satisfies


\[ N(L) \le \Bigl(\frac{4}{\pi}\Bigr)^{r_2} \frac{n!}{n^n}\, |D_K|^{1/2}. \]


We shall prove Theorem 5.21 shortly by applying Minkowski's Lattice-Point
Theorem to the lattice $\Phi(I)$ in $\Omega = \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$, where $\Phi$ is the mapping described
after the proof of Theorem 5.16. The particular compact convex set in the
application takes some time to describe, and we return to that matter shortly.

Meanwhile, let us see a little of the utility of Theorem 5.21. The techniques of
Chapter I are more useful for computing class numbers for $n = 2$ than Theorem
5.21 is, and we therefore consider only $n \ge 3$. For $n = 3$, we must have
$r_2 \le 1$. Theorem 5.21 shows that every equivalence class of ideals in $R$ has a
representative $L$ with


\[ N(L) \le \frac{4}{\pi} \cdot \frac{3!}{3^3}\, |D_K|^{1/2} = \frac{8}{9\pi}\, |D_K|^{1/2} < (0.283)\, |D_K|^{1/2}. \]


Problems 1-2 at the end of the chapter give examples of cubic extensions of $\mathbb{Q}$
whose discriminants are $-23$, $-31$, and $-44$. Since these have $(0.283)|D_K|^{1/2} <
(0.283) \cdot 7 < 2$, the representative ideal in each case must have norm 1 and must
be $R$. Thus for all three of these cubic fields, $R$ is a principal ideal domain.


²⁰Again we are applying the Dirichlet pigeonhole principle.




For the cubic field $K = \mathbb{Q}(\sqrt[3]{2})$, we know from Section 2 that the discriminant
is $D_K = -108$. Consequently the estimate shows that every class of ideals has
a representative with norm $\le 2$. If an ideal $I$ has $N(I) = 2$, then 2 has to be a
member, and $I$ divides $(2)R$. Proposition 5.10d shows that the factorization of
$(2)R$ is as $P^3$ for a certain unique prime ideal $P$. Thus $R$ and $P$ represent all
equivalence classes, and $h_K$ is 1 or 2. If there is some $r \in R$ with $|N_{K/\mathbb{Q}}(r)| = 2$,
then $P = (r)$, and the class number is 1; otherwise it is 2. The element $\sqrt[3]{2}$ has
$|N_{K/\mathbb{Q}}(\sqrt[3]{2})| = 2$, and thus $P = (\sqrt[3]{2})$. Therefore $R$ is a principal ideal domain
when $K = \mathbb{Q}(\sqrt[3]{2})$.

For Dedekind's example, namely the cubic number field $K$ built from
$X^3 + X^2 - 2X + 8$, we saw in Section 4 that the discriminant is $D_K = -503$.
Then the constant in the estimate is $< (0.283)\sqrt{503} < 6.35$. So the interest is in
ideals of norm $\le 6$. In ruling out ideals that are principal, we need consider only
prime ideals with norm $\le 6$. Problems 24-32 at the end of the chapter identify
all the prime ideals of this form and show that they are all principal ideals! We
conclude that $h_K = 1$, i.e., that the $R$ in Dedekind's example is a principal ideal
domain. Not every cubic number field has class number 1, however; Problem 4
gives an example.
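The numerical constants in these examples are quick to recompute. The sketch below evaluates the bound of Theorem 5.21 with the factor $N(I)$ omitted; the function name `minkowski_bound` is an assumption for this illustration, and $r_2 = 1$ is used for each cubic field, which is the worst case allowed by $r_2 \le 1$.

```python
import math

def minkowski_bound(n, r2, disc):
    """Return (4/pi)^r2 * n!/n^n * sqrt(|disc|), the bound on N(L) in Theorem 5.21."""
    return (4 / math.pi) ** r2 * math.factorial(n) / n ** n * math.sqrt(abs(disc))

# Discriminants of the cubic fields discussed in the text (n = 3, r2 = 1).
bounds = {disc: minkowski_bound(3, 1, disc) for disc in (-23, -31, -44, -108, -503)}

for disc, bound in bounds.items():
    # Ideals of norm at most floor(bound) represent every ideal class.
    print(disc, round(bound, 3), math.floor(bound))
```

The integer parts 1, 1, 1, 2, 6 are the norm bounds used in the three discussions above.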

Before turning to the proof of Theorem 5.21, let us observe the following 
striking consequence. 


Corollary 5.22 (Minkowski). For any number field $K$ of degree $n$,


\[ |D_K|^{1/2} \ge \Bigl(\frac{\pi}{4}\Bigr)^{r_2} \frac{n^n}{n!}. \]


Therefore $|D_K| > 1$ if $n \ge 2$, and there exists at least one prime number that
ramifies in $K$.


REMARKS. With a more general number field F than Q as base field, it can 
happen that no prime ideal ramifies in a certain nontrivial extension field K/F. 
See Problems 5-9 at the end of the chapter. 


PROOF. Set $I = R$ in Theorem 5.21, so that $N(I) = 1$. The nonzero element
$s$ must have $|N_{K/\mathbb{Q}}(s)| \ge 1$. The theorem says that $(4/\pi)^{r_2}(n!/n^n)|D_K|^{1/2} \ge 1$,
and this is the displayed inequality of the corollary. Since $r_2 \le \tfrac12 n$, $(\pi/4)^{r_2} \ge
(\pi/4)^{n/2}$, and thus $|D_K|^{1/2} \ge 2^{-n}\pi^{n/2} n^n / n!$. Denote the right side of this
inequality by $a_n$. For $n = 2$, we have $a_2 = \pi/2 > 1$. Also, $a_{n+1}/a_n =
\tfrac12 \pi^{1/2} (1 + \tfrac1n)^n > \pi^{1/2}$, since $(1 + \tfrac1n)^n$ is monotone increasing²¹ with $n$ and is
$> 2$ for $n = 2$. Hence $a_n > 1$ for all $n \ge 2$. By Theorem 5.5 some prime number
ramifies in $K$.
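The growth of the lower bound $a_n = 2^{-n}\pi^{n/2} n^n/n!$ used in this proof can be tabulated directly. The following sketch is a numerical check only, with the function name `a` chosen to match the notation of the proof; floating-point arithmetic is an assumption that is harmless for the small values of $n$ shown.

```python
import math

def a(n):
    """The lower bound a_n = 2^(-n) * pi^(n/2) * n^n / n! for |D_K|^(1/2)."""
    return 2.0 ** (-n) * math.pi ** (n / 2) * n ** n / math.factorial(n)

values = {n: a(n) for n in range(2, 21)}
ratios = {n: a(n + 1) / a(n) for n in range(2, 20)}

# Each ratio a_{n+1}/a_n equals (1/2) * sqrt(pi) * (1 + 1/n)^n.
for n in (2, 3, 4, 5):
    print(n, round(values[n], 4), round(ratios[n], 4))
```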


²¹To see this monotonicity, expand $(1 + \tfrac{1}{n+1})^{n+1}$ and $(1 + \tfrac1n)^n$ by the Binomial
Theorem, and observe that the asserted inequality holds term by term.




We turn to the proof of Theorem 5.21. We again make use of the map
$\Phi : K \to \Omega = \mathbb{R}^{r_1} \times \mathbb{C}^{r_2} \cong \mathbb{R}^n$ of the previous section. Lemma 5.17 shows that
$\Phi(R)$ is a lattice in $\Omega$, and our interest will be in the sublattice $\Phi(I)$, $I$ being the
nonzero ideal under study. The idea is to consider the set of $\omega \in \Omega$ for which the
function

\[ N(\omega) = \Bigl(\prod_{i=1}^{r_1} |\omega|_i\Bigr) \Bigl(\prod_{i=r_1+1}^{r_1+r_2} |\omega|_i^2\Bigr) \]

has $N(\omega) \le c$, $c$ being a positive number. Since $N(\Phi(x)) = |N_{K/\mathbb{Q}}(x)|$ for
$x \in K$, the question of finding a nonzero member $s$ of $I$ with $|N_{K/\mathbb{Q}}(s)| \le c$ is the same
as the question of finding a nonzero lattice point in the set for which $N(\omega) \le c$.
Once we sort out how large $c$ has to be for the answer to be affirmative, then
the inequality of the theorem will result. The tool will again be the Minkowski
Lattice-Point Theorem (Theorem 5.16), but the difficulty is that the set for which
$N(\omega) \le c$ is not necessarily convex.

The nature of the set for which $N(\omega) \le c$ becomes clearer by considering the
case of $K = \mathbb{Q}(\sqrt{m})$ with $m > 0$. The map $\Phi$ carries $x + y\sqrt{m}$ for $x$ and $y$ in
$\mathbb{Q}$ to the pair $(x + y\sqrt{m},\, x - y\sqrt{m})$ in $\mathbb{R}^2$, and if we parametrize $\omega$ by the pair
$(x, y)$, then the set for which $N(\omega) \le c$ is the part of the $(x, y)$ plane containing
the origin and bounded by the two hyperbolas $x^2 - my^2 = c$ and $x^2 - my^2 = -c$.
This set is not convex, and it is not even bounded.

Briefly, an individual coordinate of our $\Omega = \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$, whether a factor of
type $\mathbb{R}$ or a factor of type $\mathbb{C}$, contributes something compact convex to the set
for which $N(\omega) \le c$ as long as the other coordinates are fixed, but as soon as
we allow more than one coordinate to vary, then the product formula defining
$N(\omega)$ produces sets that are neither convex nor bounded. To use Theorem 5.16,
we want to inscribe a compact convex set within the set for which $N(\omega) \le c$,
making the inscribed set contain the origin, be closed under negatives, and have
volume as large as possible.

If we were trying to inscribe such a compact convex set in a region cut out by
two hyperbolas as above, then the best possible set to use would be a rectangle
with sides parallel to the axes. However, the description above in terms of those
two hyperbolas used a noncanonical parametrization of elements of $\mathbb{Q}(\sqrt{m})$ as
all rational combinations $x + y\sqrt{m}$.

Let us proceed for the general case by using only the structure that is given to
us, without using any noncanonical parametrization. The things that are canonical
are the factors $\mathbb{R}$ and $\mathbb{C}$, the functions $\|\cdot\|_i$ defined on them, and functions of these.
For the example above, the function $N(\omega)$ is given by $N(\omega) = |\omega|_1 |\omega|_2$. The
geometric set in $\mathbb{R}^2 = \{(\omega_1, \omega_2)\}$ to consider is changed from above; it is still the
set toward the origin from two hyperbolas, but the hyperbolas are changed to be
$\omega_1 \omega_2 = \pm c$, having the axes as asymptotes. The inscribed convex set becomes the
set with $|\omega_1| + |\omega_2| \le 2c^{1/2}$. The containment of the latter set in the set toward
the origin from the two hyperbolas follows from the inequality $|\omega_1 \omega_2|^{1/2} \le
\tfrac12 (|\omega_1| + |\omega_2|)$, which is a consequence of the inequality $\tfrac14 (|\omega_1| - |\omega_2|)^2 \ge 0$.
In the general case the inscribed convex set is described in terms of the function


\[ T(\omega) = \sum_{i=1}^{r_1} |\omega|_i + 2 \sum_{i=r_1+1}^{r_1+r_2} |\omega|_i. \]


The set of $\omega$ with $T(\omega) \le t$, $t$ being a positive constant, is evidently a compact
convex set containing 0 and closed under negatives, and the functions $T(\omega)$ and
$N(\omega)$ are connected by the arithmetic-geometric mean inequality, which says
that


\[ N(\omega)^{1/n} \le \frac{1}{n}\, T(\omega). \]


Because of this inequality the set with $T(\omega) \le t$ is contained in the set with
$N(\omega) \le t^n / n^n$.
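The inequality $N(\omega)^{1/n} \le \tfrac1n T(\omega)$ can be spot-checked numerically. In the sketch below a point $\omega$ is modeled as a Python list whose first `r1` entries are real numbers and whose remaining entries are complex numbers; this representation, the sample point, and the function names `N`, `T`, and `amgm_gap` are assumptions made for the illustration.

```python
import random

def N(omega, r1):
    """Product of |w_i| over the real coordinates times |w_i|^2 over the complex ones."""
    value = 1.0
    for i, w in enumerate(omega):
        value *= abs(w) if i < r1 else abs(w) ** 2
    return value

def T(omega, r1):
    """Sum of |w_i| over the real coordinates plus twice the sum over the complex ones."""
    return sum(abs(w) if i < r1 else 2 * abs(w) for i, w in enumerate(omega))

def amgm_gap(omega, r1):
    """T(omega)/n - N(omega)^(1/n), which the inequality asserts is >= 0."""
    n = r1 + 2 * (len(omega) - r1)
    return T(omega, r1) / n - N(omega, r1) ** (1.0 / n)

sample = [3.0, -0.5, complex(1, 2)]      # r1 = 2, r2 = 1, so n = 4
rng = random.Random(0)
worst = min(
    amgm_gap([rng.uniform(-5, 5), rng.uniform(-5, 5),
              complex(rng.uniform(-5, 5), rng.uniform(-5, 5))], 2)
    for _ in range(1000)
)
print(amgm_gap(sample, 2), worst)
```

Each complex coordinate is counted twice in both $N$ and $T$, which is why the mean is taken over $n = r_1 + 2r_2$ terms.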

Since the absolute value in each $\mathbb{R}$ or $\mathbb{C}$ coordinate is canonical, so is the
notion of volume, given on rectangular sets by taking products; as usual the
understanding is that the set in a factor of $\mathbb{R}$ on which the absolute value is
$\le k$ contributes a factor of $2k$ to the volume, and the comparable set in a factor
of $\mathbb{C}$ contributes a factor of $\pi k^2$. If $V_0$ denotes the volume of a fundamental
parallelotope for the lattice $\Phi(I)$ in the $n$-dimensional Euclidean space $\Omega$, then
the Minkowski Lattice-Point Theorem says that the set with $T(\omega) \le t$, and
therefore also the set with $N(\omega) \le t^n/n^n$, contains a nonzero lattice point as
soon as the volume of the set with $T(\omega) \le t$ is $\ge 2^n V_0$. In other words, as soon
as the volume of the set with $T(\omega) \le t$ is $\ge 2^n V_0$, there exists an $s \ne 0$ in $I$ with
$|N_{K/\mathbb{Q}}(s)| \le t^n/n^n$.

To prove Theorem 5.21, we therefore need to know two things: the volume $V_0$
of a fundamental parallelotope for $\Phi(I)$ and the volume of the set with $T(\omega) \le t$.
Then we can find the smallest $t$ for which the set with $T(\omega) \le t$ has volume
$\ge 2^n V_0$, and we can sort out the details.

Let us compute the volume $V_0$. Let $\Gamma = (\alpha_1, \dots, \alpha_n)$ be an ordered $\mathbb{Z}$ basis
of the ideal $I$. The easy case in which to compute $V_0$ is that $r_1 = n$, i.e., that all
the field embeddings of $K$ into $\mathbb{C}$ are real. In this case the discriminant $D(\Gamma)$ is
the determinant of the $n$-by-$n$ matrix $[B_{ij}]$ with


\[
B_{ij} = \mathrm{Tr}_{K/\mathbb{Q}}(\alpha_i \alpha_j) = \sum_{k=1}^{n} \sigma_k(\alpha_i \alpha_j)
= \sum_{k=1}^{n} \sigma_k(\alpha_i)\sigma_k(\alpha_j) = \sum_{k=1}^{n} A_{ik} A_{jk},
\]


where $[A_{ij}]$ is the matrix with $A_{ij} = \sigma_j(\alpha_i)$. We recognize $|\det[A_{ij}]|$ as the
volume of a fundamental parallelotope for $\Phi(I)$, and therefore $|D(\Gamma)| = V_0^2$.
By Proposition 5.1, $D(\Gamma) = N(I)^2 D_K$, and therefore $V_0 = N(I)|D_K|^{1/2}$.




This answer for the value of $V_0$ is not correct if some of the embeddings of $K$
into $\mathbb{C}$ are nonreal, since $|\det[\sigma_j(\alpha_i)]|$ no longer equals $V_0$. To see how to adjust
matters, suppose that $\sigma$ is a nonreal field mapping of $K$ into $\mathbb{C}$. Then the $n$-by-$n$
matrix $[\sigma_j(\alpha_i)]$ contains one column $z = (z_1, \dots, z_n)^{\mathsf{T}}$ corresponding to $\sigma$ and another
column $\bar z = (\bar z_1, \dots, \bar z_n)^{\mathsf{T}}$ corresponding to $\bar\sigma$. The entries in the $k$th row tell how $\alpha_k$ is
embedded in $\mathbb{C}$, namely at some point $z_k = x_k + iy_k$ for $\sigma$ and at $\bar z_k = x_k - iy_k$ for $\bar\sigma$.
To compute $V_0$ properly, we should have $x_k$ in one column and $y_k$ in the other,
instead of $z_k$ and $\bar z_k$. We can transform from the matrix with columns containing
$z_k$ and $\bar z_k$ to one containing $x_k$ and $y_k$ by first replacing the first column by the
sum of the two, which is $2x_k = z_k + \bar z_k$, and by then replacing the second column
by the difference of the second column and half the new first column, which is
$\tfrac12 (\bar z_k - z_k) = -iy_k$. These operations do not change the determinant. Repeating
these steps for each of the $r_2$ pairs of nonreal field mappings, we obtain a matrix
for which the absolute value of the determinant, apart from factors of 2 in $r_2$ of the
columns, is $V_0$. Consequently $V_0 = 2^{-r_2} |\det[\sigma_j(\alpha_i)]|$. Then $V_0^2 = 2^{-2r_2} |D(\Gamma)|$,
and we obtain

\[ V_0 = 2^{-r_2} N(I)\, |D_K|^{1/2}. \]
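The factor $2^{-r_2}$ can be seen in the smallest nonreal case. The sketch below treats $K = \mathbb{Q}(i)$, where $n = 2$, $r_1 = 0$, $r_2 = 1$, and $D_K = -4$, and compares the two ways of computing $V_0$ for the ideals $R = \mathbb{Z}[i]$ and $(1 + i)$. The choice of field and ideals and the helper names are assumptions made for this illustration.

```python
def det2(m):
    """Determinant of a 2-by-2 matrix given as two rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def v0_from_embeddings(basis):
    """2^(-r2) * |det[sigma_j(alpha_i)]| with r2 = 1; the two embeddings of Q(i)
    are the identity and complex conjugation."""
    matrix = [[z, z.conjugate()] for z in basis]
    return abs(det2(matrix)) / 2

def v0_from_real_coordinates(basis):
    """Area of the parallelogram spanned by the basis in C = R^2."""
    matrix = [[z.real, z.imag] for z in basis]
    return abs(det2(matrix))

unit_ideal = [complex(1, 0), complex(0, 1)]     # Z basis of Z[i], norm 1
ideal_1_plus_i = [complex(1, 1), complex(-1, 1)]  # Z basis of (1 + i), norm 2

for basis in (unit_ideal, ideal_1_plus_i):
    print(v0_from_embeddings(basis), v0_from_real_coordinates(basis))
```

Both computations agree with $2^{-r_2} N(I)|D_K|^{1/2} = \tfrac12 \cdot N(I) \cdot 2 = N(I)$, namely 1 and 2 for the two ideals.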


Now let us compute the volume of the set of $\omega$ in $\Omega$ for which $T(\omega) \le t$. Write
$\omega = (x_1, \dots, x_{r_1}, z_{r_1+1}, \dots, z_{r_1+r_2})$. The volume is the integral of 1 over the set
on which $|x_1| + \cdots + |x_{r_1}| + 2|z_{r_1+1}| + \cdots + 2|z_{r_1+r_2}| \le t$. The set for the integration
is invariant under $x_i \mapsto -x_i$ and under rotation in any variable $z_j$, and hence the
volume equals


\[
2^{r_1} (2\pi)^{r_2} \int_E \rho_{r_1+1} \cdots \rho_{r_1+r_2}\; dx_1 \cdots dx_{r_1}\, d\rho_{r_1+1} \cdots d\rho_{r_1+r_2},
\]


where $E$ is the set on which all variables are $\ge 0$ and


\[ \sum_{i=1}^{r_1} x_i + 2 \sum_{i=r_1+1}^{r_1+r_2} \rho_i \le t. \]


For $r_1 + 1 \le i \le r_1 + r_2$, introduce $x_i = 2\rho_i$, and make the change of variables.
Then the volume becomes


\[
2^{r_1 - r_2} \pi^{r_2} \int_{E'} x_{r_1+1} \cdots x_{r_1+r_2}\; dx_1 \cdots dx_{r_1+r_2},
\]


where $E'$ is the set of $(x_1, \dots, x_{r_1+r_2})$ in $\mathbb{R}^{r_1+r_2}$ with all $x_i \ge 0$ and with $\sum_{i=1}^{r_1+r_2} x_i \le t$.
Finally we make a change of variables that replaces each $x_i$ by $t y_i$, and the result
is that


\[
\mathrm{volume}(\{T(\omega) \le t\}) = 2^{r_1 - r_2} \pi^{r_2} t^n \int_S y_{r_1+1} \cdots y_{r_1+r_2}\; dy_1 \cdots dy_{r_1+r_2},
\]


where $S$ is the standard simplex in $\mathbb{R}^{r_1+r_2}$ with all $y_i \ge 0$ and with $\sum_{i=1}^{r_1+r_2} y_i \le 1$.
This definite integral is of a standard type that is evaluated by the following
lemma.


Lemma 5.23. In $\mathbb{R}^m$, let $S$ be the standard simplex with all $x_i \ge 0$ and with
$\sum_{i=1}^{m} x_i \le 1$. If $a_1, \dots, a_m$ are positive real numbers, then


\[
\int_S x_1^{a_1 - 1} x_2^{a_2 - 1} \cdots x_m^{a_m - 1}\; dx_1 \cdots dx_m
= \frac{\Gamma(a_1)\Gamma(a_2) \cdots \Gamma(a_m)}{\Gamma(a_1 + \cdots + a_m + 1)}.
\]


REMARKS. The expression $\Gamma(\,\cdot\,)$ is understood to be the usual gamma function,
whose value at positive integers is given by $\Gamma(n + 1) = n!$. We merely sketch
the proof; the details can be found in many books that treat changes of variables
for multiple integrals.²²

SKETCH OF PROOF. Let $I$ be the unit cube, given by $0 \le u_i \le 1$ for $1 \le i \le m$.
We make the change of variables $x = g(u)$ that carries the points $u$ of the cube
$I$ one-one onto the points $x$ of the simplex $S$ and that is given by


\[
\begin{aligned}
x_1 &= u_1, \\
x_2 &= (1 - u_1)u_2, \\
&\;\;\vdots \\
x_m &= (1 - u_1) \cdots (1 - u_{m-1})u_m.
\end{aligned}
\]


The volume element transforms by the absolute value of the Jacobian determinant,
specifically by


\[ dx = |g'(u)|\, du = (1 - u_1)^{m-1} (1 - u_2)^{m-2} \cdots (1 - u_{m-1})\, du, \]


and the result of the change of variables is that the given integral equals


\[ \prod_{i=1}^{m} \int_0^1 u_i^{a_i - 1} (1 - u_i)^{a_{i+1} + \cdots + a_m}\, du_i. \]

The factors here can be evaluated by means of Euler's formula


\[ \int_0^1 u^{a-1} (1 - u)^{b-1}\, du = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a + b)}, \]


and the lemma follows.
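A numerical check of Lemma 5.23 in the plane may be reassuring. The sketch below compares the gamma-function expression with a midpoint Riemann sum over the triangle $x_1, x_2 \ge 0$, $x_1 + x_2 \le 1$. The grid size, the restriction to $m = 2$, and the function names are assumptions made for this illustration, and the Riemann sum is accurate only to within a small discretization error.

```python
import math

def dirichlet_rhs(exponents):
    """Gamma(a_1)...Gamma(a_m) / Gamma(a_1 + ... + a_m + 1)."""
    numerator = math.prod(math.gamma(a) for a in exponents)
    return numerator / math.gamma(sum(exponents) + 1)

def dirichlet_lhs_2d(a1, a2, steps=400):
    """Midpoint Riemann sum of x^(a1-1) * y^(a2-1) over the standard simplex in R^2."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        for j in range(steps):
            y = (j + 0.5) * h
            if x + y <= 1.0:
                total += x ** (a1 - 1) * y ** (a2 - 1)
    return total * h * h

# Exponents (1, 2): one real and one complex coordinate, as in the case r1 = r2 = 1.
exact = dirichlet_rhs((1, 2))
approx = dirichlet_lhs_2d(1, 2)
print(exact, approx)
```

For the exponents $(1, 2)$ the exact value is $\Gamma(1)\Gamma(2)/\Gamma(4) = 1/6$.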


²²One such is the author's Basic Real Analysis; the details appear in the problems at the end of
Chapter VI of that book. Another such book is Rudin's Principles of Mathematical Analysis.




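For the integral of interest to us, we have $m = r_1 + r_2$, $a_1 = \cdots = a_{r_1} = 1$,
and $a_{r_1+1} = \cdots = a_{r_1+r_2} = 2$. Thus $a_1 + \cdots + a_{r_1+r_2} = r_1 + 2r_2 = n$, and we obtain

$$\mathrm{volume}(\{T(\omega) < t\}) = 2^{r_1-r_2}\pi^{r_2}t^n\,\frac{\Gamma(1)^{r_1}\Gamma(2)^{r_2}}{\Gamma(n+1)} = \frac{2^{r_1-r_2}\pi^{r_2}t^n}{n!}.$$

Finally we can put everything together. We are to solve for $t$ such that this
expression is equal to $2^nV_0$, and then there exists an element $s \ne 0$ in $J$ with
$|N_{K/\mathbb{Q}}(s)| \le t^n/n^n$. Since $V_0 = 2^{-r_2}N(J)|D_K|^{1/2}$, the equation to solve for $t$ is

$$\frac{2^{r_1-r_2}\pi^{r_2}t^n}{n!} = 2^{n-r_2}N(J)|D_K|^{1/2}.$$

Since $n - r_1 = 2r_2$, we have $t^n = \left(\frac{4}{\pi}\right)^{r_2} n!\,N(J)|D_K|^{1/2}$, and the element $s \ne 0$ in $J$ satisfies

$$|N_{K/\mathbb{Q}}(s)| \le \left(\frac{4}{\pi}\right)^{r_2}\frac{n!}{n^n}\,|D_K|^{1/2}N(J).$$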


This completes the proof of Theorem 5.21. 
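
As a numerical illustration, the constant multiplying $N(J)$ in the final inequality can be evaluated directly. The Python sketch below is not part of the text: the function name is hypothetical, and the discriminants passed to it ($-20$ for $\mathbb{Q}(\sqrt{-5})$ and $-243$ for $\mathbb{Q}(\sqrt[3]{3})$) are assumed inputs rather than values derived here.

```python
from math import factorial, pi, sqrt


def minkowski_bound(n, r2, disc):
    """Constant (4/pi)**r2 * n!/n**n * |D_K|**(1/2) from the end of the proof.

    n    -- degree of K over Q
    r2   -- number of complex-conjugate pairs of nonreal embeddings
    disc -- field discriminant D_K (supplied by the caller, not computed here)
    """
    return (4 / pi) ** r2 * factorial(n) / n ** n * sqrt(abs(disc))


# Assumed inputs: D_K = -20 for Q(sqrt(-5)) and D_K = -243 for Q(cbrt(3)).
print(minkowski_bound(2, 1, -20))    # about 2.85
print(minkowski_bound(3, 1, -243))   # about 4.41
```

Under the stated assumption on the discriminant, the second value is consistent with the bound 4 on ideal norms that appears in Problem 3a below.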


7. Problems 


1. Take as known that the discriminant of a cubic polynomial $F(X) = X^3 + pX + q$
is $-(4p^3 + 27q^2)$. In each of the following cases, let $K = \mathbb{Q}[X]/(F(X))$ with
$F(X)$ as indicated, and verify that the field discriminant $D_K$ is as indicated:

(a) $F(X) = X^3 - X - 1$, $D_K = -23$.
(b) $F(X) = X^3 + X + 1$, $D_K = -31$.

2. Let $K = \mathbb{Q}[X]/(F(X))$, where $F(X) = X^3 - 2X^2 + 2$.

(a) Use the formula of the previous problem to show that the discriminant of
the polynomial $F(X)$ is $-44$.

(b) Using Proposition 5.2, show that $D_K$ cannot be $-11$, and conclude that
$D_K = -44$.

3. This problem computes the class number of $K = \mathbb{Q}(\sqrt[3]{3})$.

(a) Show that every equivalence class of nonzero ideals contains an ideal with
norm $\le 4$.

(b) Show that the prime ideals whose norm is a power of 2 are $P_1 = (2, \sqrt[3]{3} - 1)$,
whose norm is 2, and $P_2 = (2, \sqrt[3]{9} + \sqrt[3]{3} + 1)$, whose norm is 4.

(c) Show for $P_1$ that 2 is a multiple of $\sqrt[3]{3} - 1$, and show for $P_2$ that 2 is a
multiple of $\sqrt[3]{9} + \sqrt[3]{3} + 1$.

(d) Show that the only prime ideal whose norm is 3 is $(\sqrt[3]{3})$.

(e) Deduce that the class number of $K$ is 1.




4. Let $R$ be the ring of algebraic integers in the number field $K = \mathbb{Q}(\sqrt[3]{7})$, and let
$I$ be the doubly generated ideal $I = (2, 1 + \sqrt[3]{7})$ in $R$.

(a) Prove that $N(I) = 2$.

(b) Prove that $I$ is not a principal ideal.


Problems 5-9 give an example of a nontrivial finite extension $L/K$ of number fields
in which no prime ideal for $K$ ramifies in passing to $L$. By contrast, Corollary 5.22
says that there always exists a prime that ramifies in passing from $\mathbb{Q}$ to a nontrivial
finite extension. The example has $L = \mathbb{Q}(\sqrt{-5}, \sqrt{-1})$ and $K = \mathbb{Q}(\sqrt{-5})$. Let
$K' = \mathbb{Q}(\sqrt{5})$ and $K'' = \mathbb{Q}(\sqrt{-1})$. Observe that $L/\mathbb{Q}$ is a Galois extension, and so
are all the various quadratic extensions of $L$ over $K$, $K'$, and $K''$, as well as of $K$, $K'$,
and $K''$ over $\mathbb{Q}$. The problems make use of the fact that ramification indices multiply
in passing to an extension in stages, and so do residue class degrees.


5. Show that the minimal polynomial of $\sqrt{-1} + \sqrt{-5}$ over $\mathbb{Q}$ is $X^4 + 12X^2 + 16$,
and deduce that the elements $\frac{1}{2}(\sqrt{-1} \pm \sqrt{-5})$ are algebraic integers in $L$.


6. By making use of the formula for $D(\xi)$ in terms of $\mathcal{D}(\xi)$, where $\xi$ is an element in
$L$, prove that $|D(\frac{1}{2}(\sqrt{-1} + \sqrt{-5}))| = 2^4 5^2$. Consequently $D_L$ divides $2^4 5^2$.


7. Verify the following decompositions of the ideals $(2)$ and $(5)$ when extended
from $\mathbb{Z}$ to the rings $R$, $R'$, and $R''$ of algebraic integers in $K$, $K'$, and $K''$:

(a) $(2)R = \wp_2^2$ with $f = 1$, and $(5)R = \wp_5^2$ with $f = 1$.

(b) $(2)R' = \wp_2$ with $f = 2$, and $(5)R' = \wp_5^2$ with $f = 1$.

(c) $(2)R'' = \wp_2^2$ with $f = 1$, and $(5)R'' = \wp_1\wp_2$ with $f = 1$.


8. Let $T$ be the ring of algebraic integers in $L$. Since $L/\mathbb{Q}$ is a Galois extension, the
only possible decompositions of $(p)T$, when $p$ is a prime number, have $(e, f, g)$
equal to $(4, 1, 1)$ or $(2, 2, 1)$ or $(2, 1, 2)$ or $(1, 4, 1)$ or $(1, 2, 2)$ or $(1, 1, 4)$. Here
$e$ is the ramification index, $f$ is the residue class degree, and $g$ is the number of
distinct prime factors. Using the product formulas for ramification degrees and
comparing what happens for the passage $\mathbb{Q} \subseteq K' \subseteq L$ with what happens for
the passage $\mathbb{Q} \subseteq K'' \subseteq L$, show that the only possibilities for $(p)T$ with $p = 2$
and $p = 5$ are

(a) $(e, f, g) = (2, 2, 1)$ for $(2)T$, i.e., $(2)T = P^2$ with $\dim_{\mathbb{F}_2}(T/P) = 2$.

(b) $(e, f, g) = (2, 1, 2)$ for $(5)T$, i.e., $(5)T = P_1^2P_2^2$ with $\dim_{\mathbb{F}_5}(T/P_1) = \dim_{\mathbb{F}_5}(T/P_2) = 1$.


9. Return to the situation with $\mathbb{Q} \subseteq K \subseteq L$, where $K = \mathbb{Q}(\sqrt{-5})$. According to
Problem 7a, the prime decompositions of $(2)R$ and $(5)R$ are $(2)R = \wp_2^2$ and
$(5)R = \wp_5^2$.

(a) Using the results of Problem 8, show that $\wp_2T = P$ and $\wp_5T = P_1P_2$, i.e.,
$\wp_2T$ is prime, and $\wp_5T$ is the product of two distinct prime ideals.

(b) Show how to conclude from these facts and from Theorem 5.6 that no prime
ideal in $R$ ramifies in $T$. (Educational note: The field $L$ is the "Hilbert class
field" of $K$ in the sense of Section 1; the order of the Galois group $\mathrm{Gal}(L/K)$
matches the class number of $K$.)


Problems 10-16 concern the cyclotomic field $K = \mathbb{Q}(e^{2\pi i/p})$, where $p > 2$ is a
prime number. They show that the discriminant is given by $D_K = (-1)^{(p-1)(p-2)/2}p^{p-2}$ and that a $\mathbb{Z}$
basis of the ring $R$ of algebraic integers in $K$ consists of $\{1, \zeta, \zeta^2, \dots, \zeta^{p-2}\}$, where
$\zeta = e^{2\pi i/p}$.

10. Show that $K$ has no real-valued field mappings into $\mathbb{C}$, and deduce that $N_{K/\mathbb{Q}}(x)$
is positive for every $x \ne 0$ in $K$.

11. Let $F(X) = X^{p-1} + X^{p-2} + \cdots + 1$ be the minimal polynomial of $\zeta$ over $\mathbb{Q}$,
and let $G(X) = F(X + 1)$. Suppose that $k$ is an integer with $\mathrm{GCD}(k, p) = 1$.

(a) Prove that $G(X)$ is the minimal polynomial of $\zeta^k - 1$, and deduce that the
norm of $\zeta^k - 1$ is given by $F(1) = p$.

(b) Why does it follow that $N_{K/\mathbb{Q}}(1 - \zeta^k) = p$?

(c) Prove that $(1 - \zeta^k)/(1 - \zeta)$ is a unit of $R$.

12. With notation as in the previous problem, prove that the different $\mathcal{D}(\zeta^k)$ of $\zeta^k$
has $|\mathcal{D}(\zeta^k)| = p/|\zeta^k - 1|$.

13. Deduce from the previous problem that $D(\zeta) = (-1)^{(p-1)(p-2)/2}p^{p-2}$.

14. Let $\lambda = 1 - \zeta$. Problem 11b shows that $N_{K/\mathbb{Q}}(\lambda) = p$. Prove that

(a) the $\mathbb{Z}$ span of $\{1, \zeta, \zeta^2, \dots, \zeta^{p-2}\}$ equals the $\mathbb{Z}$ span of $\{1, \lambda, \lambda^2, \dots, \lambda^{p-2}\}$,

(b) an equality $p = \prod_{k=1}^{p-1}(1 - \zeta^k)$ holds,

(c) there exists a unit $\varepsilon$ of $R$ such that $p = \varepsilon(1 - \zeta)^{p-1} = \varepsilon\lambda^{p-1}$.

15. Using Problem 14c, prove that the principal ideals $(p)R$ and $(\lambda)$ in $R$ are related
by $(p)R = (\lambda)^{p-1}$, and deduce from this fact that $(\lambda)$ is a prime ideal.

16. Apply Proposition 5.2 to the $\mathbb{Q}$ basis $\{1, \lambda, \lambda^2, \dots, \lambda^{p-2}\}$ of $K$ lying in $R$ to show
that no factor of $p$ can be eliminated from $D(\lambda) = D(\zeta)$; take into account the
highest powers of $\lambda$ that divide each term. Conclude that $D_K = D(\zeta)$ and that
$\{1, \zeta, \zeta^2, \dots, \zeta^{p-2}\}$ is a $\mathbb{Z}$ basis of $R$.


Problems 17-18 use the same notation as in the text of the chapter: $K$ is a number
field of degree $n$ over $\mathbb{Q}$, $R$ is its ring of algebraic integers, $D_K$ is its field discriminant,
the field mappings of $K$ into $\mathbb{C}$ are denoted by $\sigma_i$ for $1 \le i \le n$, $r_1$ of the $\sigma_i$'s are
real-valued, and $r_2$ complex-conjugate pairs of the $\sigma_i$'s are nonreal.

17. Prove that the sign of $D_K$ is $(-1)^{r_2}$.

18. (Stickelberger's condition) Let $\Gamma = (\alpha_1, \dots, \alpha_n)$ be an ordered $n$-tuple of
members of $R$ linearly independent over $\mathbb{Q}$, and suppose that $K/\mathbb{Q}$ is a Galois
extension. Write $\det[\sigma_i(\alpha_j)] = P - N$, where $P$ is the sum of all the terms of
the determinant corresponding to even permutations and $N$ is the sum corresponding
to odd permutations. Using Galois theory, prove that $P + N$ and $PN$
are in $\mathbb{Z}$. Then write $D(\Gamma) = (\det[\sigma_i(\alpha_j)])^2 = (P + N)^2 - 4PN$, and deduce that
the integer $D(\Gamma)$ is congruent to 1 or 0 modulo 4. (Educational note: A variant
of this argument proves the same conclusion about $D(\Gamma)$ without the assumption
that $K/\mathbb{Q}$ is a Galois extension. One makes use of the smallest normal extension
of $\mathbb{Q}$ containing $K$; this is the splitting field of the minimal polynomial of any
primitive element of $K$.)


Problems 19-23 continue with the notation of Problems 17-18. It is to be proved that
a suitable localization $S^{-1}R$ of $R$ is a principal ideal domain for which the group of
units is finitely generated as an abelian group. Let $h$ be the class number of $K$.

19. Let $I_1, \dots, I_h$ be ideals representing all the equivalence classes of ideals in $R$.
For each $I_j$, let $u_j$ be a nonzero element of $I_j$, and put $u = u_1 \cdots u_h$. Define
$S = \{1, u, u^2, \dots\}$. Prove that $S^{-1}R$ is a principal ideal domain.

20. (a) Prove that if a member $a$ of $R$ divides $u^k$ within $R$ for some $k \ge 0$, then $a$
is a unit in $S^{-1}R$, i.e., $a^{-1}$ is in $S^{-1}R$.

(b) Prove conversely that if a member $a$ of $R$ has the property that $au^{-m}$ is a
unit in $S^{-1}R$ for some $m \ge 0$, then $a$ divides $u^k$ within $R$ for some integer
$k \ge 0$.

21. Let $P_1, \dots, P_l$ be the distinct prime ideals appearing in the unique factorization
of $(u)$, and suppose that $P_j^h = (b_j)$ for $1 \le j \le l$. Let $au^{-m}$ and $k$ be as in
Problem 20b, and write $u^k = ab$ with $b \in R$.

(a) Why must each $b_j$ necessarily be a unit in $S^{-1}R$?

(b) Prove that there exist integers $n_j \ge 0$ for $1 \le j \le l$ such that the element
$d = \prod_j b_j^{n_j}$ has $(a) = (d)P_1^{t_1} \cdots P_l^{t_l}$ for some integers $t_j$ with $0 \le t_j \le h - 1$.

(c) In this case, why must $P_1^{t_1} \cdots P_l^{t_l}$ be a principal ideal?

22. Suppose that there are $N$ tuples $(e_1, \dots, e_l)$ with $0 \le e_j \le h - 1$ for all $j$ such
that $P_1^{e_1} \cdots P_l^{e_l}$ is a principal ideal. For the $i$th such tuple, let the principal ideal
be denoted by $(c_i)$, $1 \le i \le N$. Prove that if $k$, $a$, and $d$ are as in the previous
problem and if the principal ideal in (c) of that problem is $(c_i)$, then $a = dc_i\varepsilon$
for some $\varepsilon$ in $R^\times$.

23. Conclude from the three previous problems that the group of units of $S^{-1}R$ is
finitely generated as an abelian group.


Problems 24-32 complete the discussion in Section 4 of Dedekind's example of a
cubic extension of $\mathbb{Q}$ with a common index divisor. The field is $K = \mathbb{Q}(\xi)$, where
$\xi$ is a root of $F(X) = X^3 + X^2 - 2X + 8$, and it was shown in Section 4 that
$D(\xi) = -2^2 \cdot 503$. Let $R$ be the ring of algebraic integers in $K$. It will be shown that
$R$ is a principal ideal domain.


24. Show that $\eta = 4/\xi$ is a root of the polynomial $G(X) = X^3 - X^2 + 2X + 8$, and
conclude that $\eta$ is in $R$.


25. (a) By rewriting $F(\xi)/\xi$ in terms of $\xi$ and $\eta$, show that $\xi^2 + \xi - 2 + 2\eta = 0$.

(b) By rewriting $G(\eta)/\eta$ in terms of $\xi$ and $\eta$, show that $2\xi + 2 - \eta + \eta^2 = 0$.
Conclude from this formula and (a) that products of $\xi$ and $\eta$ may be simplified
according to the table

$$\xi^2 = -\xi + 2 - 2\eta, \qquad \eta^2 = -2\xi - 2 + \eta, \qquad \xi\eta = 4.$$

(c) Using the first formula in (b), deduce the containment of abelian groups
given by $\mathbb{Z}(\{1, \xi, \xi^2\}) \subseteq \mathbb{Z}(\{1, \xi, \eta\})$.

(d) Using the first formula in (b), deduce that $\eta$ does not lie in $\mathbb{Z}(\{1, \xi, \xi^2\})$.

(e) Conclude from the above facts that $\{1, \xi, \eta\}$ and $\{1, \xi, \frac{1}{2}(\xi^2 + \xi)\}$ are $\mathbb{Z}$
bases of $R$.


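26. Let $P$ be a prime ideal in $R$ containing $(2)R$, write $\mathbb{F}$ for the field $R/P$, let
$\varphi : R \to \mathbb{F}$ be the quotient homomorphism, and let $\bar\xi = \varphi(\xi)$ and $\bar\eta = \varphi(\eta)$. By
applying $\varphi$ to the table in Problem 25b and using the fact that the additive group
generated by $\{1, \xi, \eta\}$ is all of $R$, prove that $\mathbb{F}$ has only two elements, i.e., that
the residue class degree is $f = 1$, and that the only possibilities for $\varphi$ are the
following:

$$\begin{aligned}
\varphi &= \varphi_{0,0} \quad\text{with}\quad \varphi_{0,0}(\xi) = 0,\ \ \varphi_{0,0}(\eta) = 0,\\
\varphi &= \varphi_{1,0} \quad\text{with}\quad \varphi_{1,0}(\xi) = 1,\ \ \varphi_{1,0}(\eta) = 0,\\
\varphi &= \varphi_{0,1} \quad\text{with}\quad \varphi_{0,1}(\xi) = 0,\ \ \varphi_{0,1}(\eta) = 1.
\end{aligned}$$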


27. Conversely show that the three functions $\varphi_{0,0}$, $\varphi_{1,0}$, $\varphi_{0,1}$ defined on $\xi$ and $\eta$ in
the previous problem extend to well-defined ring homomorphisms of $R$ onto $\mathbb{F}_2$.


28. Let $P_{0,0}$, $P_{1,0}$, and $P_{0,1}$ be the kernels of the ring homomorphisms in the previous
problem. Prove that these ideals all have norm 2 and that $(2)R = P_{0,0}P_{1,0}P_{0,1}$.


29. (a) Prove that $P_{0,0} = (2, \xi, \eta)$, $P_{1,0} = (2, \xi + 1, \eta)$, and $P_{0,1} = (2, \xi, \eta + 1)$.

(b) Exhibit $\eta$ as a member of the ideal $(2, \xi + 1)$, and show therefore that
$P_{1,0} = (2, \xi + 1)$.

(c) Similarly show that $P_{0,1} = (2, \eta + 1)$ and that $P_{0,0} = (2, \xi - \eta)$.

30. The previous problem exhibited $P_{0,0}$, $P_{1,0}$, and $P_{0,1}$ explicitly as doubly generated.
In fact, use of the norm map $N_{K/\mathbb{Q}}$ will ultimately show them to be principal
ideals.

(a) Show that if $H(X)$ is the field polynomial over $\mathbb{Q}$ of an element $\theta$ in $K$, then
$N_{K/\mathbb{Q}}(\theta) = -H(0)$ and $N_{K/\mathbb{Q}}(\theta - q) = -H(q)$ for every $q \in \mathbb{Q}$.

(b) Prove that $N_{K/\mathbb{Q}}(\xi) = N_{K/\mathbb{Q}}(\eta) = -8 = -2^3$, that $|N_{K/\mathbb{Q}}(\xi + 3)| = 2^2$,
that $|N_{K/\mathbb{Q}}(\xi - 1)| = |N_{K/\mathbb{Q}}(\xi + 2)| = 2^3$, and that $|N_{K/\mathbb{Q}}(\xi - 2)| = 2^4$.

(c) Prove that $(\xi) = P_{0,0}^{a}P_{1,0}^{b}P_{0,1}^{c}$ for unique exponents $\ge 0$ whose sum is 3,
and that $(\eta) = P_{0,0}^{\alpha}P_{1,0}^{\beta}P_{0,1}^{\gamma}$ for unique exponents $\ge 0$ whose sum is 3.

(d) Using the fact that $\xi\eta = 4$, prove that $a + \alpha = b + \beta = c + \gamma = 2$.

(e) Using the definitions of $P_{0,0}$, $P_{1,0}$, and $P_{0,1}$ as kernels, prove that $b = 0$ and
$\gamma = 0$.

(f) Conclude that $(\xi) = P_{0,0}P_{0,1}^2$ and that $(\eta) = P_{0,0}P_{1,0}^2$.


31. This problem uses the norm computations in Problem 30b.

(a) Using the defining homomorphisms, show that if $l$ is an odd integer, then
$P_{1,0}$ contains $(\xi + l)$, but $P_{0,0}$ and $P_{0,1}$ do not.

(b) Show that $(\xi + 3) = P_{1,0}^2$ and that $(\xi - 1) = P_{1,0}^3$.

(c) Using the defining homomorphisms, show that if $l$ is an even integer, then
$P_{0,1}$ contains $(\xi + l)$, but $P_{1,0}$ does not.

(d) Show that $(2, \xi) = P_{0,0}P_{0,1}$.

(e) Show that if $l$ is an even integer not divisible by 4, then $P_{0,1}^2$ does not contain
$(\xi + l)$.

(f) Show that $(\xi + 2) = P_{0,0}^2P_{0,1}$ and that $(\xi - 2) = P_{0,0}^3P_{0,1}$.

32. (a) From the identity $(\xi + 2)P_{0,0} = (\xi - 2)$ that results from Problem 31f,
deduce that $r_{0,0} = \frac{\xi - 2}{\xi + 2}$ is in $R$ and that $P_{0,0} = (r_{0,0})$.

(b) Deduce similarly that $P_{1,0}$ and $P_{0,1}$ are principal ideals.

(c) Using Theorem 5.6, show that $R$ contains no ideals of norm 3.

(d) Using Theorem 5.6, show that the only ideal in $R$ of norm 5 is $(5, 1 + \xi)$.

(e) Show that $|N_{K/\mathbb{Q}}(1 + \xi)| = 10$, and deduce that $(1 + \xi) = (5, 1 + \xi)P$,
where $P$ is one of the three ideals $P_{0,0}$, $P_{1,0}$, and $P_{0,1}$.

(f) Why does it follow that $(5, 1 + \xi)$ is a principal ideal?

(g) Prove that $R$ is a principal ideal domain.


CHAPTER VI 


Reinterpretation with Adeles and Ideles 


Abstract. This chapter develops tools for a more penetrating study of algebraic number theory than 
was possible in Chapter V and concludes by formulating two of the main three theorems of Chapter 
V in the modern setting of “adeles” and “ideles” commonly used in the subject. 

Sections 1-5 introduce discrete valuations, absolute values, and completions for fields, always 
paying attention to implications for number fields and for certain kinds of function fields. Section 1 
contains a prototype for all these notions in the construction of the field $\mathbb{Q}_p$ of p-adic numbers formed
out of the rationals. Discrete valuations in Section 2 are a generalization of the order-of-vanishing 
function about a point in the theory of one complex variable. Absolute values in Section 3 are 
real-valued multiplicative functions that give a metric on a field, and the pair consisting of a field and 
an absolute value is called a valued field. Inequivalent absolute values have a certain independence 
property that is captured by the Weak Approximation Theorem. Completions in Section 4 are 
functions mapping valued fields into their metric-space completions. Section 5 concerns Hensel’s 
Lemma, which in its simplest form allows one to lift roots of polynomials over finite prime fields 
$\mathbb{F}_p$ to roots of corresponding polynomials over p-adic fields $\mathbb{Q}_p$.

Section 6 contains the main theorem for investigating the fundamental question of how prime 
ideals split in extensions. Let $K$ be a finite separable extension of a field $F$, let $R$ be a Dedekind
domain with field of fractions $F$, and let $T$ be the integral closure of $R$ in $K$. The question concerns
the factorization of an ideal $\mathfrak{p}T$ in $T$ when $\mathfrak{p}$ is a nonzero prime ideal in $R$. If $F_{\mathfrak{p}}$ denotes the
completion of $F$ with respect to $\mathfrak{p}$, the theorem explains how the tensor product $K \otimes_F F_{\mathfrak{p}}$ splits
uniquely as a direct sum of completions of valued fields. The theorem in effect reduces the question
of the splitting of $\mathfrak{p}T$ in $T$ to the splitting of $F_{\mathfrak{p}}$ in a complete field in which only one of the prime
factors of $\mathfrak{p}T$ plays a role.

Section 7 is a brief aside mentioning additional conclusions one can draw when the extension 
K/F is a Galois extension. 

Section 8 applies the main theorem of Section 6 to an analysis of the different of K/F and 
ultimately to the absolute discriminant of a number field. With the new sharp tools developed in the 
present chapter, including a Strong Approximation Theorem that is proved in Section 8, a complete 
proof is given for the Dedekind Discriminant Theorem; only a partial proof had been accessible in 
Chapter V. 

Sections 9-10 specialize to the case of number fields and to function fields that are finite separable 
extensions of $\mathbb{F}_q(X)$, where $\mathbb{F}_q$ is a finite field. The adele ring and the idele group are introduced
for each of these kinds of fields, and it is shown how the original field embeds discretely in the 
adeles and how the multiplicative group embeds discretely in the ideles. The main theorems are 
compactness theorems about the quotient of the adeles by the embedded field and about the quotient 
of the normalized ideles by the embedded multiplicative group. Proofs are given only for number 
fields. In the first case the compactness encodes the Strong Approximation Theorem of Section 8 
and the Artin product formula of Section 9. In the second case the compactness encodes both the 
finiteness of the class number and the Dirichlet Unit Theorem. 




1. p-adic Numbers 


This chapter will sharpen some of the number-theoretic techniques used in 
Chapter V, finally arriving at the setting of “adeles” and “ideles” in which many 
of the more recent results in number theory have tidy formulations. Although 
Chapter V dealt only with number fields, the present chapter will allow a greater 
degree of generality that includes results in the algebraic geometry of curves. 
This greater degree of generality will not require much extra effort, and it will 
allow us to use each of the subjects of number theory and algebraic geometry to 
motivate the other. 

The first section of Chapter V returned to the idea that one can get some 
information about the integer solutions of a Diophantine equation by considering 
the equation as a system of congruences modulo each prime number. However, 
we lose information by considering only primes for the modulus, and this fact 
lies behind the failure of Chapter V to give a complete proof of the Dedekind 
Discriminant Theorem (Theorem 5.5). The proof that we did give was of a related 
result, Kummer's criterion (Theorem 5.6), which concerns a field $\mathbb{Q}(\xi)$, where
$\xi$ is a root of an irreducible monic polynomial $F(X)$ in $\mathbb{Z}[X]$. The statement of
Theorem 5.6 involves the reduction of F(X) modulo certain prime numbers p 
and no other congruences. 

The Chinese Remainder Theorem tells us that a congruence modulo any integer 
can be solved by means of congruences modulo prime powers, and the formulation 
of Theorem 5.6 uses only congruences modulo primes raised to the first power. 
Let us strip away the complicated setting from such congruences and see some 
examples of how the use of prime powers can make a difference. 


EXAMPLES. 


(1) Consider the problem of finding a square root of 5 modulo powers of 2.
For the first power, we have

$$x^2 - 5 = (x - 1)^2 + 2x - 6 \equiv (x - 1)^2 \bmod 2,$$

i.e., $x^2 - 5$ is the square of a linear factor modulo 2. For the second power, the
computation is

$$x^2 - 5 = (x - 1)(x + 1) - 4 \equiv (x - 1)(x + 1) \bmod 4,$$

and $x^2 - 5$ is the product of two distinct linear factors modulo 4. For the third
power, $x^2 - 5$ is irreducible modulo 8 because the only odd squares modulo 8 are
congruent to $+1$. Thus the polynomial $x^2 - 5$ exhibits a third kind of behavior when considered
modulo 8. For higher powers of 2, the irreducibility persists because a nontrivial
factorization modulo $2^k$ with $k > 3$ would imply a nontrivial factorization
modulo 8.


(2) Consider the problem of finding a square root of 17 modulo powers of 2.
We readily compute that

$$\begin{aligned}
x^2 - 17 &= (x - 1)^2 + 2x - 18 \equiv (x - 1)^2 \bmod 2,\\
x^2 - 17 &= (x - 1)(x + 1) - 16 \equiv (x - 1)(x + 1) \bmod 4,\\
x^2 - 17 &= (x - 1)(x + 1) - 16 \equiv (x - 1)(x + 1) \bmod 8,\\
x^2 - 17 &= (x - 1)(x + 1) - 16 \equiv (x - 1)(x + 1) \bmod 16,\\
x^2 - 17 &= (x - 7)(x + 7) + 32 \equiv (x - 7)(x + 7) \bmod 32,\\
x^2 - 17 &= (x - 9)(x + 9) + 64 \equiv (x - 9)(x + 9) \bmod 64,
\end{aligned}$$

i.e., that the factorization of $x^2 - 17$ begins in the same way as for $x^2 - 5$ but that
$x^2 - 17$ continues to factor as the product of two distinct linear factors modulo
$2^3$, $2^4$, $2^5$, and $2^6$. We can argue inductively that this pattern persists through all
higher powers. In fact, suppose that $x^2 - 17 \equiv (x - m)(x + m) \bmod 2^k$ for an
integer $k \ge 3$. Then

$$x^2 - 17 = x^2 - m^2 + a2^k,$$

and $m$ must be odd. Then we can write

$$x^2 - 17 = \bigl(x - (m - a2^{k-1})\bigr)\bigl(x + (m - a2^{k-1})\bigr) + a2^k(1 - m + a2^{k-2}).$$

The factor $(1 - m + a2^{k-2})$ is even, and this equality shows that $x^2 - 17$ is the
product of two distinct linear factors modulo $2^{k+1}$. This completes the induction.
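
The inductive step of Example 2 can be traced numerically. The following Python sketch is an illustration only and is not part of the original text; the function name `lift_sqrt17` and the starting value $m = 1$ for $k = 3$ are choices made here. It applies the replacement of $m$ by $m - a2^{k-1}$, where $m^2 = 17 + a2^k$, and checks the congruence at each stage.

```python
def lift_sqrt17(k_max):
    """Return {k: m_k} for 3 <= k <= k_max with m_k odd and m_k**2 ≡ 17 (mod 2**k).

    Hypothetical helper: starts from m_3 = 1 and applies the inductive step
    of Example 2, namely m_{k+1} = m_k - a_k * 2**(k-1) when m_k**2 = 17 + a_k * 2**k.
    """
    m = 1                      # 1**2 = 17 - 2 * 2**3, so a_3 = -2
    roots = {3: m}
    for k in range(3, k_max):
        a, remainder = divmod(m * m - 17, 2 ** k)
        assert remainder == 0  # inductive hypothesis: m**2 ≡ 17 (mod 2**k)
        m -= a * 2 ** (k - 1)
        roots[k + 1] = m
    return roots


roots = lift_sqrt17(8)
for k, m in roots.items():
    print(k, m, (m * m - 17) % 2 ** k)   # last column is always 0
```

With this starting value the first few integers produced are $1$, $9$, and $-23$; a different starting root would give a different but equally valid sequence.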


One immediate observation from the two examples is that the factorizations
of $x^2 - 5$ and $x^2 - 17$ are the same modulo 2 and modulo $2^2$ but are qualitatively
distinct modulo higher powers of 2. Another observation is the nature of the data
produced by the inductive argument in Example 2: For each $k$, we obtain an odd
integer $m_k$ such that $m_k^2 \equiv 17 \bmod 2^k$, and the $m_k$'s are constructed in such a
way that $m_{k+1} = m_k - a_k2^{k-1}$ if $m_k^2 = 17 + a_k2^k$. It follows that if $l \ge k$, then
$m_l - m_k$ is divisible by $2^{k-1}$, i.e., by higher and higher powers of 2 as $k$ increases.

A first conclusion is that we get additional information by using congruences 
modulo prime powers. A second and more subtle conclusion is that it would be 
desirable to regard the sequence $\{m_k\}$ as stabilizing in some sense; then we could
regard the system of congruences modulo all powers $2^k$ as having a single pair
of solutions that we can consider as square roots of 17. In this case we would 
not have to think about infinitely many solutions to infinitely many unrelated 
congruences. 

The construction that is to follow in this section, which is due to K. Hensel, 
will capture this information as a single “2-adic number.” Conversely the 2-adic 
number carries with it the congruence information modulo $2^k$ for all positive
integers k. 




Thus the revised method of considering congruences prime by prime will be 
a two-step process, first a step of “localization” and then a step of “completion.” 
In our application in Chapter V, we did not explicitly make use of localization 
in the sense of Chapter VII of Basic Algebra, but it was there implicitly —in 
Proposition 5.2 for example and in the proof of Theorem 5.6. Carrying out the 
details of setting up the theory behind the two-stage process will take some work 
and will occupy the first four sections of this chapter. Let us get started. 

Let $p$ be a prime number. We define a real-valued function $|\cdot|_p$ on the field
$\mathbb{Q}$ of rationals as follows: we take $|0|_p = 0$, and for any rational $r = p^mab^{-1}$
with $a$ and $b$ equal to integers relatively prime to $p$, we define $|r|_p = p^{-m}$. The
function $|\cdot|_p$ is called the p-adic absolute value on $\mathbb{Q}$. It has the following
properties:

(i) $|x|_p \ge 0$ with equality if and only if $x = 0$,
(ii) $|x + y|_p \le \max(|x|_p, |y|_p)$,
(iii) $|xy|_p = |x|_p|y|_p$,
(iv) $|-1|_p = |1|_p = 1$, and
(v) $|-x|_p = |x|_p$.

In fact, with (ii), equality holds if $|x|_p \ne |y|_p$, and the case with $|x|_p = |y|_p$
comes down to the observation that $\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}$ has no factor of $p$ in its
denominator if $b$ and $d$ are relatively prime to $p$. Property (iii) comes down to the
fact that if $a, b, c, d$ are relatively prime to $p$, then so are $ac$ and $bd$. The other
properties follow from the first three: To see that $|1|_p = 1$ in (iv), we observe
from (iii) that $|1|_p$ is a nonzero solution of $x^2 = x$ and thus has to be 1. This
conclusion and (iii) together show that $|-1|_p$ is a positive solution of $x^2 = 1$ and
thus has to be 1. Property (v) follows immediately by combining (iii) and (iv).

Inequality (ii) is called the ultrametric inequality. It implies that $|x + y|_p \le
|x|_p + |y|_p$, and consequently the function $d(x, y) = |x - y|_p$ satisfies the triangle
inequality

$$d(x, y) \le d(x, z) + d(z, y).$$

Since (i) shows that $d(x, y) \ge 0$ with equality exactly when $x = y$ and since (v)
implies that $d(x, y) = |x - y|_p = d(y, x)$, the function $d$ on $\mathbb{Q} \times \mathbb{Q}$ is a metric.
It is called the p-adic metric on $\mathbb{Q}$.
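
The definition of $|\cdot|_p$ translates directly into exact rational arithmetic. The Python sketch below is offered as an illustration under the definition just given; the names `padic_valuation` and `padic_abs` are hypothetical and do not appear in the text. It strips factors of $p$ from the numerator and denominator of $r$ to find the exponent $m$, and returns $p^{-m}$ as a fraction.

```python
from fractions import Fraction


def padic_valuation(r, p):
    """Exponent m with r = p**m * a / b, where a and b are prime to p (r nonzero)."""
    r = Fraction(r)
    if r == 0:
        raise ValueError("the exponent m is not defined for r = 0")
    m = 0
    num, den = r.numerator, r.denominator
    while num % p == 0:        # factors of p in the numerator raise m
        num //= p
        m += 1
    while den % p == 0:        # factors of p in the denominator lower m
        den //= p
        m -= 1
    return m


def padic_abs(r, p):
    """p-adic absolute value |r|_p = p**(-m), with |0|_p = 0."""
    r = Fraction(r)
    if r == 0:
        return Fraction(0)
    return Fraction(p) ** (-padic_valuation(r, p))


x, y = Fraction(3, 4), Fraction(5, 4)
print(padic_abs(12, 2), padic_abs(Fraction(5, 8), 2))          # 1/4 8
print(padic_abs(x + y, 2) <= max(padic_abs(x, 2), padic_abs(y, 2)))  # True
```

In the last line $|x|_2 = |y|_2 = 4$ while $|x + y|_2 = |2|_2 = 1/2$, an instance in which the ultrametric inequality (ii) is strict.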

The field $\mathbb{Q}_p$ of p-adic numbers will be obtained by completing this metric and
extending the field operations to the completion. Let us see to the details. Regard
the space $\prod_{j=1}^{\infty} \mathbb{Q}$ of sequences $\{q_j\}_{j=1}^{\infty}$ of rational numbers as the direct product
of copies of the ring $\mathbb{Q}$, the operations being taken coordinate by coordinate.
Then $\prod_{j=1}^{\infty} \mathbb{Q}$ is a commutative ring with identity, the identity being the sequence
whose terms are all equal to 1.




As is usual for metric spaces, we say that a sequence of rationals, i.e., a member
$\{q_j\}$ of $\prod_{j=1}^{\infty} \mathbb{Q}$, is convergent to $q \in \mathbb{Q}$ in the p-adic metric if for any real $\epsilon > 0$,
there exists an integer $N$ such that $|q_n - q|_p < \epsilon$ for all $n \ge N$. Convergence
in this metric is quite different from what one might expect; for example the
sequence $\{2^j\}_{j=1}^{\infty}$ is convergent to 0 when $p = 2$. The sequence $\{q_j\}$ is a Cauchy
sequence in the p-adic metric if for any real $\epsilon > 0$, there exists an integer $N$
such that $|q_m - q_n|_p < \epsilon$ for all $m \ge N$ and all $n \ge N$. Convergent sequences
are Cauchy, as follows from the inequality $|q_m - q_n|_p \le |q_m - q|_p + |q - q_n|_p$.
Cauchy sequences need not be convergent, but every Cauchy sequence $\{q_n\}$ is
bounded in the sense that there is some real $C$ with $|q_n|_p \le C$ for all $n$.


EXAMPLE 2, CONTINUED. We obtained a sequence $\{m_k\}$ of odd integers such
that $l \ge k$ implies that $m_l - m_k$ is divisible by $2^{k-1}$ and $m_k^2 - 17$ is divisible by $2^k$.
In terms of the 2-adic absolute value, $|m_l - m_k|_2 \le 2^{-(k-1)}$ and $|m_k^2 - 17|_2 \le 2^{-k}$.
The sequence $\{m_k\}$ is therefore a Cauchy sequence in the 2-adic metric, and the
sequence $\{m_k^2\}$ is convergent in the 2-adic metric to 17.
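
The two inequalities in this example can be observed on actual numbers. The Python sketch below is a hedged illustration: the helper names `abs2` and `m_sequence` are hypothetical, and the sequence is generated from the starting value $m_3 = 1$ by the rule $m_{k+1} = m_k - a_k2^{k-1}$ with $m_k^2 = 17 + a_k2^k$ from Example 2. It prints the 2-adic sizes of successive differences and of $m_k^2 - 17$.

```python
from fractions import Fraction


def abs2(n):
    """2-adic absolute value of an integer n."""
    if n == 0:
        return Fraction(0)
    m = 0
    while n % 2 == 0:
        n //= 2
        m += 1
    return Fraction(1, 2 ** m)


def m_sequence(k_max):
    """{k: m_k} for 3 <= k <= k_max, built by the rule of Example 2 from m_3 = 1."""
    m, terms = 1, {3: 1}
    for k in range(3, k_max):
        a = (m * m - 17) // 2 ** k      # exact, since 2**k divides m*m - 17
        m -= a * 2 ** (k - 1)
        terms[k + 1] = m
    return terms


terms = m_sequence(10)
for k in range(3, 10):
    # |m_{k+1} - m_k|_2 is at most 2**-(k-1); |m_k**2 - 17|_2 is at most 2**-k
    print(k, abs2(terms[k + 1] - terms[k]), abs2(terms[k] ** 2 - 17))
```

Both printed columns shrink toward 0 as $k$ grows, which is the Cauchy property of $\{m_k\}$ and the convergence of $\{m_k^2\}$ to 17 in the 2-adic metric.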


It follows from the ultrametric inequality that the sum and difference of two Cauchy
sequences are Cauchy, and (ii), (iii), and the boundedness of Cauchy sequences imply
that the product of two Cauchy sequences is Cauchy. Therefore the subset $\mathcal{R}$ of
Cauchy sequences is a subring with identity within $\prod_{j=1}^{\infty} \mathbb{Q}$.

In the theory of metric spaces, one defines a suitable notion of equivalence of 
Cauchy sequences, and the set of equivalence classes becomes a complete metric 
space,¹ any member $q$ of $\mathbb{Q}$ being identified with the constant Cauchy sequence
whose terms all equal $q$. With the p-adic metric, one can then prove that the field
operations extend to the completion, and the completion is the field of p-adic 
numbers. This verification is a little tedious when done directly, and we can 
proceed more expeditiously by using some elementary ring theory. 

Since convergent sequences are Cauchy, the set $\mathcal{I}$ of sequences convergent to 0
is a subset of the ring $\mathcal{R}$. The sum or difference of two such sequences is again
convergent to 0, and $\mathcal{I}$ is an additive subgroup. We shall show that $\mathcal{I}$ is in fact
an ideal in $\mathcal{R}$. Thus let $\{z_n\}$ be convergent to 0, and let $\{q_n\}$ be Cauchy. Since
$\{q_n\}$ is Cauchy, it is bounded, say with $|q_n|_p \le M$ for all $n$. If $\epsilon > 0$ is given,
choose $N$ such that $n \ge N$ implies $|z_n|_p < \epsilon/M$. Then $n \ge N$ implies that
$|z_nq_n|_p = |z_n|_p|q_n|_p < (\epsilon/M)M = \epsilon$. Hence $\{z_nq_n\}$ is convergent to 0, and $\mathcal{I}$ is
an ideal in $\mathcal{R}$.


¹This construction is carried out in detail in Section II.11 of the author's Basic Real Analysis.


Proposition 6.1. With the p-adic absolute value imposed on $\mathbb{Q}$, let $\mathcal{R}$ be the
subring of $\prod_{j=1}^{\infty} \mathbb{Q}$ consisting of all Cauchy sequences, and let $\mathcal{I}$ be the ideal in
$\mathcal{R}$ consisting of all sequences convergent to 0. Then $\mathcal{I}$ is a maximal ideal in $\mathcal{R}$,
and the quotient $\mathcal{R}/\mathcal{I}$ is a field. Consequently the Cauchy completion of $\mathbb{Q}$ in the
p-adic metric is a topological field $\mathbb{Q}_p$ into which $\mathbb{Q}$ embeds via a field mapping.
If $|\cdot|_p$ denotes the function $d(\,\cdot\,, 0)$ on $\mathbb{Q}_p$, then $|\cdot|_p$ is a continuous extension
of the p-adic absolute value from $\mathbb{Q}$ to $\mathbb{Q}_p$, and it satisfies

(a) $|x|_p \ge 0$ with equality if and only if $x = 0$,
(b) $|x + y|_p \le \max(|x|_p, |y|_p)$, and
(c) $|xy|_p = |x|_p|y|_p$.

The subset $\mathbb{Z}_p = \{x \in \mathbb{Q}_p \mid |x|_p \le 1\}$ is an open closed subring of $\mathbb{Q}_p$ in which
$\mathbb{Z}$ is dense, and $\mathbb{Z}_p$ is compact. Consequently the topological field $\mathbb{Q}_p$ is locally
compact.


REMARKS. The field $\mathbb{Q}_p$ is called the field of p-adic numbers, and the ring
$\mathbb{Z}_p$ is called the ring of p-adic integers. The ring $\mathbb{Z}_p$ contains the identity of $\mathbb{Q}_p$.


PROOF. First let us prove that $\mathcal{I}$ is a maximal ideal. Arguing by contradiction,
let $\{q_n\}$ be a Cauchy sequence that is not in $\mathcal{I}$, i.e., is not convergent to 0. Then
there exists an $\epsilon_0 > 0$ such that $|q_n|_p \ge \epsilon_0$ for infinitely many $n$. Choose $N$ such
that $|q_n - q_m|_p < \epsilon_0/2$ whenever $n \ge N$ and $m \ge N$, and find some $n_0 \ge N$ with
$|q_{n_0}|_p \ge \epsilon_0$. Then $n \ge N$ implies that $|q_n|_p \ge \epsilon_0/2$ because otherwise we would
have $\epsilon_0 \le |q_{n_0}|_p \le |q_n - q_{n_0}|_p + |q_n|_p < \epsilon_0/2 + \epsilon_0/2 = \epsilon_0$, contradiction. Let
$\{r_n\}$ be the sequence with $r_n = 0$ for $n < N$ and $r_n = q_n^{-1}$ for $n \ge N$. For $n \ge N$
and $m \ge N$, we have

$$|r_n - r_m|_p = |q_n^{-1} - q_m^{-1}|_p = |(q_m - q_n)/(q_mq_n)|_p = |q_m - q_n|_p/(|q_m|_p|q_n|_p) \le 4\epsilon_0^{-2}|q_m - q_n|_p,$$

and it follows that $\{r_n\}$ is Cauchy and hence lies in $\mathcal{R}$. Since $\mathcal{I}$ is an ideal in $\mathcal{R}$,
$\{r_nq_n\}$ is Cauchy. The terms of the sequence $\{r_nq_n\}$ are all equal to 1 for $n \ge N$,
and hence $\{r_nq_n\}$ differs from the identity of $\mathcal{R}$ by a member of $\mathcal{I}$. Consequently
the identity is in $\mathcal{I}$. This is a contradiction, since the members of the constant
sequence $\{1\}$ are at distance $|1 - 0|_p = 1$ from 0. Hence $\mathcal{I}$ is a maximal ideal,
and $\mathcal{R}/\mathcal{I}$ is necessarily a field.

Meanwhile, the Cauchy completion $\mathbb{Q}_p$ of $\mathbb{Q}$ is the set of equivalence classes
from $\mathcal{R}$, two members of $\mathcal{R}$ being equivalent if they differ by a sequence convergent
to 0. Consequently the Cauchy completion $\mathbb{Q}_p$ is precisely $\mathcal{R}/\mathcal{I}$ as a set. The
mapping $\mathbb{Q} \to \mathcal{R} \to \mathcal{R}/\mathcal{I}$ carrying a member $q$ of $\mathbb{Q}$ to the constant sequence
$\{q_n\}$ with all $q_n = q$ and then from $\mathcal{R}$ to the quotient $\mathcal{R}/\mathcal{I} = \mathbb{Q}_p$ evidently respects
the operations and hence is a field mapping. This mapping identifies $\mathbb{Q}$ with a
subset of $\mathbb{Q}_p$. The metric $d$ on $\mathbb{Q}$ extends uniquely to a continuous function on
the completion $\mathbb{Q}_p \times \mathbb{Q}_p$, and therefore the p-adic absolute value $|\cdot|_p = d(\,\cdot\,, 0)$
extends to a continuous function on $\mathbb{Q}_p$.

Property (a) for the function $|\cdot|_p$ on $\mathbb{Q}_p$ follows from the fact that the
continuous extension of $d$ is a metric on $\mathbb{Q}_p$. To see that (b) and (c) hold on
$\mathbb{Q}_p$, let $x$ and $y$ be members of $\mathbb{Q}_p = \mathcal{R}/\mathcal{I}$, and let $\{q_n\}$ and $\{r_n\}$ be respective
coset representatives of them in $\mathcal{R}$. Then $\{q_n + r_n\}$ and $\{q_nr_n\}$ are representatives
of $x + y$ and $xy$ by definition, and the continuity of the p-adic absolute value on
$\mathbb{Q}_p$ implies that $\lim_n |q_n + r_n|_p = |x + y|_p$ and $\lim_n |q_nr_n|_p = |xy|_p$. From the
first of these limit formulas and from (b) on $\mathbb{Q}$, we obtain

$$|x + y|_p = \limsup_n |q_n + r_n|_p \le \limsup_n \max(|q_n|_p, |r_n|_p) = \max(|x|_p, |y|_p)$$

since $\lim_n |q_n|_p = |x|_p$ and $\lim_n |r_n|_p = |y|_p$. This proves (b) on $\mathbb{Q}_p$. Similarly

$$|xy|_p = \lim_n |q_nr_n|_p = \lim_n |q_n|_p|r_n|_p = (\lim_n |q_n|_p)(\lim_n |r_n|_p) = |x|_p|y|_p,$$

and this proves (c) on $\mathbb{Q}_p$.

To see that addition, subtraction, and multiplication are continuous on $\mathbb{Q}_p \times \mathbb{Q}_p$,
let $\{x_n\}$ and $\{y_n\}$ be convergent sequences in $\mathbb{Q}_p$ with respective limits $x$ and $y$.
Use of (b) on $\mathbb{Q}_p$ gives

$$|(x_n + y_n) - (x + y)|_p = |(x_n - x) + (y_n - y)|_p \le \max(|x_n - x|_p, |y_n - y|_p).$$

The right side has limit 0 in $\mathbb{R}$, and therefore $x_n + y_n$ has limit $x + y$ in $\mathbb{Q}_p$. A
completely analogous argument, making use also of the equality $|-1|_p = |1|_p$,
shows that subtraction is continuous. Consider multiplication. If $M$ is an upper
bound for the absolute values $|x_n|_p$ and $|y_n|_p$, then use of (b) and (c) on $\mathbb{Q}_p$ gives

$$\begin{aligned}
|x_ny_n - xy|_p &= |x_n(y_n - y) + y(x_n - x)|_p\\
&\le \max(|x_n(y_n - y)|_p, |y(x_n - x)|_p)\\
&= \max(|x_n|_p|y_n - y|_p, |y|_p|x_n - x|_p)\\
&\le \max(M|y_n - y|_p, |y|_p|x_n - x|_p).
\end{aligned}$$

The right side has limit 0 in $\mathbb{R}$, and therefore $x_ny_n$ has limit $xy$ in $\mathbb{Q}_p$.

To see that inversion $x \mapsto x^{-1}$ is continuous on $\mathbb{Q}_p^\times$, let $\{x_n\}$ be a sequence in
$\mathbb{Q}_p^\times$ with limit $x$ in $\mathbb{Q}_p^\times$. Since $\lim_n |x_n|_p = |x|_p$, we can find an integer $N$ such
that $|x_n|_p \ge \frac{1}{2}|x|_p$ for $n \ge N$. The computation

$$|x_n^{-1} - x^{-1}|_p = |(x - x_n)/(x_nx)|_p = |x - x_n|_p/(|x_n|_p|x|_p) \le 2|x|_p^{-2}|x - x_n|_p,$$

valid for $n \ge N$, shows that $\lim x_n^{-1} = x^{-1}$ and inversion is continuous. Consequently
$\mathbb{Q}_p$ is a topological field.

It follows immediately from properties (b) and (c) and from the equality
$|-x|_p = |x|_p$ that $\mathbb{Z}_p$ is a subring of $\mathbb{Q}_p$. Since $\mathbb{Z}_p$ is defined in terms of a
continuous function and an inequality, it is closed. It can also be defined as
the subset with $|x|_p < p$ because the p-adic absolute value takes no values
between 1 and $p$, and therefore $\mathbb{Z}_p$ is open. The most general nonzero member
of $\mathbb{Q} \cap \mathbb{Z}_p$ is of the form $q = a/b$, where $a$ and $b$ are relatively prime nonzero
integers with $|a/b|_p \le 1$. Here $|b|_p = 1$, and $p$ cannot divide $b$. If $k > 0$ is
given, then it follows that there exists $n$ with $bn - a \equiv 0 \bmod p^k$. This $n$ has
$|n - q|_p = |bn - a|_p \le p^{-k}$. So $q$ is in the closure of $\mathbb{Z}$ in $\mathbb{Q}_p$. In other words,
the closure of $\mathbb{Z}$ contains $\mathbb{Q} \cap \mathbb{Z}_p$. Since $\mathbb{Q}$ is dense in $\mathbb{Q}_p$, $\mathbb{Z}$ is dense in $\mathbb{Z}_p$.
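The approximation step in this density argument can be tried numerically. The following Python sketch is an illustration only and is not part of the text: the names `vp` and `approximate_by_integer` are our own, and the modular inverse relies on the built-in three-argument `pow` with exponent -1, which assumes Python 3.8 or later.

```python
from fractions import Fraction

def vp(x, p):
    """Net power of the prime p in the rational x, with vp(0) = +infinity."""
    x = Fraction(x)
    if x == 0:
        return float("inf")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def approximate_by_integer(q, p, k):
    """Return an integer n in [0, p**k) with b*n - a divisible by p**k, where q = a/b."""
    q = Fraction(q)
    a, b = q.numerator, q.denominator
    if b % p == 0:
        raise ValueError("q must satisfy |q|_p <= 1, so p cannot divide b")
    modulus = p ** k
    return (a * pow(b, -1, modulus)) % modulus

p, k = 5, 4
q = Fraction(2, 3)
n = approximate_by_integer(q, p, k)
# The inequality |n - q|_p <= p**(-k) is the same as vp(n - q) >= k.
print(n, vp(n - q, p) >= k)
```

Increasing `k` produces integers that are p-adically closer and closer to `q`, which is the content of the statement that `q` lies in the closure of the integers.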

For each integer $n > 0$, the set $\mathbb{Z}_p$ is covered by the closed balls of radius
$p^{-n}$ centered at the integers $0, 1, 2, \dots, p^n - 1$. In fact, every integer $z$ has
$z \equiv k \bmod p^n$ for some integer $k \in \{0, 1, 2, \dots, p^n - 1\}$. For this $k$, $|z - k|_p \le p^{-n}$.
Thus $\mathbb{Z}$ is contained in the union of the closed balls of radius $p^{-n}$ centered at
$0, 1, 2, \dots, p^n - 1$. This union is closed; since $\mathbb{Z}$ is dense in $\mathbb{Z}_p$, $\mathbb{Z}_p$ is contained
in this union. In turn, these closed balls are contained in the open balls of radius
$p^{-n+1}$ centered at the integers $0, 1, 2, \dots, p^n - 1$. Thus for any positive radius,
there exists a finite collection of open balls of that radius or less such that the
union of the open balls covers $\mathbb{Z}_p$. This means that $\mathbb{Z}_p$ is totally bounded in the
metric space $\mathbb{Q}_p$. A totally bounded closed subset of a complete metric space is
compact, and consequently $\mathbb{Z}_p$ is compact.
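The finite cover can be checked on a sample of integers. The Python sketch below is only an illustration under our own naming (`covering_center` and `vp_int` are hypothetical helpers): it assigns to each integer the center of a ball that contains it and confirms that exactly $p^n$ centers are needed.

```python
def covering_center(z, p, n):
    """Return k in {0, ..., p**n - 1} with z congruent to k modulo p**n."""
    return z % (p ** n)

def vp_int(m, p):
    """Power of the prime p dividing the integer m, with +infinity for m = 0."""
    if m == 0:
        return float("inf")
    v = 0
    while m % p == 0:
        m //= p
        v += 1
    return v

p, n = 3, 2
sample = range(-100, 101)
centers = sorted({covering_center(z, p, n) for z in sample})
# |z - k|_p <= p**(-n) is the same as vp_int(z - k) >= n.
all_close = all(vp_int(z - covering_center(z, p, n), p) >= n for z in sample)
print(len(centers), all_close)
```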

Thus the 0 element of $\mathbb{Q}_p$ has $\mathbb{Z}_p$ as a compact neighborhood. Since addition
is continuous, $x + \mathbb{Z}_p$ is a compact neighborhood of $x$, and therefore $\mathbb{Q}_p$ is locally
compact.


2. Discrete Valuations 


The construction of the p-adic absolute value on Q seemingly made use of unique 
factorization of the members of Z, but actually the unique factorization of the 
ideals in Z would have been sufficient. Thus we shall see in a moment that the 
construction extends to apply to any number field F as soon as we specify a 
nonzero prime ideal P in the ring R of algebraic integers of F. In fact, there 
is nothing special about a number field. If R is any Dedekind domain and F is 
its field of fractions, then the construction extends to F as soon as we specify a 
nonzero prime ideal P in R. 

Before describing the extended construction, let us look at the definition of
the p-adic absolute value on $\mathbb{Q}$ more closely. Recall that if $x = p^n a b^{-1}$ for
integers $a$ and $b$ relatively prime to $p$, then $|x|_p = p^{-n}$. Actually, the base $p$
in this exponential is not very important at this point, and we could have used
any real number $r > 1$ in place of $p$ in $p^{-n}$. With this adjustment the p-adic
absolute value would have been given by $|x|_p = r^{-v_p(x)}$, where $v_p(x)$ is the exact
net power of $p$ that occurs when the prime factorizations of the numerator and
denominator of $x$ are used. The exponent $v_p(x)$ is what is important; the base $r$
is unimportant.
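To make the role of the exponent concrete, here is a short Python sketch, offered as an illustration rather than as part of the text. The helper names `vp` and `absolute_value` are our own; the second function accepts an arbitrary base $r > 1$ as an assumption-labeled parameter, so that the same exponent can be exponentiated with different bases.

```python
from fractions import Fraction

def vp(x, p):
    """Net power of the prime p in the rational x, with vp(0) = +infinity."""
    x = Fraction(x)
    if x == 0:
        return float("inf")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def absolute_value(x, p, r):
    """Return r**(-vp(x, p)), with value 0 at x = 0; r is any real number > 1."""
    x = Fraction(x)
    return 0.0 if x == 0 else float(r) ** (-vp(x, p))

x = Fraction(50, 3)  # 50/3 = 2 * 3**(-1) * 5**2
for p in (2, 3, 5, 7):
    # Same exponent vp(x, p); only the scale changes with the base.
    print(p, vp(x, p), absolute_value(x, p, p), absolute_value(x, p, 2.0))
```

Changing the base rescales the values but does not change which of two elements is the larger, since both choices are decreasing functions of the same exponent.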

The expression $v_p(x)$ for $\mathbb{Q}$ is analogous to the order of vanishing of a polynomial
in one complex variable at a point, and Hensel was led to the p-adic
absolute value by carrying the notion for $\mathbb{C}[X]$ to the setting with $\mathbb{Q}$. In setting
up a generalization, we shall work first with the generalization of the order of
vanishing $v_p(x)$, since it is the more primitive notion, and in Section 3 we shall
exponentiate to obtain a generalization of the absolute value for which we can
form a completion.

To make the definitions, it is convenient to make use of fractional ideals, which
were the subject of a set of problems in Chapter VIII of Basic Algebra. Let us
recall the definition and the relevant properties. Again let $R$ be a Dedekind
domain, and let $F$ be its field of fractions. A fractional ideal of $F$ is any finitely
generated $R$ submodule $M$ of $F$. For such an $R$ module, there exists some nonzero $a \in R$ with
$aM \subseteq R$, and then $aM$ is an ideal of $R$. If $M$ is any nonzero fractional ideal,
then $M^{-1} = \{x \in F \mid xM \subseteq R\}$ is a nonzero fractional ideal, and $MM^{-1} = R$.
With this definition and property, it readily follows from the unique factorization
of ideals in $R$ that any nonzero fractional ideal $M$ of $F$ is of the form

\[ M = \prod_{j=1}^{l} P_j^{k_j} \]

for a suitable set $\{P_1, \dots, P_l\}$ of distinct nonzero prime ideals of $R$ and for suitable
nonzero integer exponents $k_j$. This expansion is unique up to the order of the
factors, and every such expression is a fractional ideal. It follows that the nonzero
fractional ideals form a group under multiplication. At the end of this section, we
shall mention how this group is related to the ideal class group of $F$ as defined in
Section V.6.

If $x \ne 0$ is in $F$, then the principal fractional ideal $(x) = xR$ has a
factorization as above. If $P$ is a nonzero prime ideal of $R$, we let $v_P(x)$ be
the integer exponent of $P$ in the prime factorization of $(x)$, with the exponent
understood to be 0 if $P$ does not occur. For
example, if $x$ is a nonzero element of $R$, then $v_P(x)$ is a nonnegative integer. To
make $v_P(\cdot)$ be everywhere defined on $F$, we define $v_P(0) = +\infty$. Then $v_P(\cdot)$
is a function from $F$ onto $\mathbb{Z} \cup \{+\infty\}$ such that

   (i) $v_P(x) = +\infty$ if and only if $x = 0$,
   (ii) $v_P(x + y) \ge \min(v_P(x), v_P(y))$ for all $x$ and $y$, and
   (iii) $v_P(xy) = v_P(x) + v_P(y)$ for all $x$ and $y$.

We shall see in Proposition 6.4 below that the effect of $v_P(\cdot)$ is to pick out from
$F$ the localization of $R$ at $P$.




To proceed further, we abstract the above construction and see what information
we can recover from it. Let $F$ be any field. A discrete valuation of $F$ is a
function $v(\cdot)$ from $F$ onto $\mathbb{Z} \cup \{+\infty\}$ such that

   (i) $v(x) = +\infty$ if and only if $x = 0$,
   (ii) $v(x + y) \ge \min(v(x), v(y))$ for all $x$ and $y$, and
   (iii) $v(xy) = v(x) + v(y)$ for all $x$ and $y$.

Observe as a consequence that

   (iv) $v(-1) = v(1) = 0$,
   (v) $v(-x) = v(x)$ for all $x$, and
   (vi) $v(x + y) = v(x)$ if $v(y) > v(x)$.

In fact, $v(1) = 0$ follows by taking $x = y = 1$ in (iii), and then $v(-1) = 0$
follows by taking $x = y = -1$ in (iii). This proves (iv), and (v) follows by
combining (iv) with (iii) for $x = -1$. For (vi), we have $v(x + y) \ge v(x)$ by
(ii). In the reverse direction, $v(x) \ge \min(v(x + y), v(y))$ by (ii) and (v); since
$v(y) > v(x)$, the minimum must be the first of the two, and thus $v(x) \ge v(x + y)$.
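Properties (ii), (iii), and (vi) can be spot-checked for the valuation on the rationals that counts powers of a prime. The Python sketch below is an illustration under our own naming, and a finite check of this kind is of course no substitute for the proofs just given.

```python
from fractions import Fraction
from itertools import product

def vp(x, p):
    """Net power of the prime p in the rational x, with vp(0) = +infinity."""
    x = Fraction(x)
    if x == 0:
        return float("inf")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

p = 3
samples = [Fraction(n, d) for n in range(-6, 7) for d in (1, 2, 3, 9) if n != 0]
violations = 0
for x, y in product(samples, repeat=2):
    vx, vy = vp(x, p), vp(y, p)
    if vp(x + y, p) < min(vx, vy):          # property (ii)
        violations += 1
    if vp(x * y, p) != vx + vy:             # property (iii)
        violations += 1
    if vy > vx and vp(x + y, p) != vx:      # property (vi)
        violations += 1
print(violations)
```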

Define $R_v = \{x \in F \mid v(x) \ge 0\}$. Property (i) shows that 0 is in $R_v$, (ii) and
(v) show that $R_v$ is closed under addition and subtraction, (iii) shows that $R_v$ is
closed under multiplication, and (iv) shows that 1 is in $R_v$. Consequently $R_v$ is
an integral domain. The ring $R_v$ is called the valuation ring of $v$ in $F$.

If $x$ is in $F$ but is not in $R_v$, then $v(x) < 0$. This inequality forces $v(x^{-1}) > 0$,
and $x^{-1}$ is in $R_v$. As a consequence, $F$ can be regarded as the field of fractions
of $R_v$.

Let $P_v = \{x \in F \mid v(x) > 0\}$. Arguing in similar fashion, we see that $P_v$ is
an ideal in $R_v$. Any $x$ in $R_v$ that is not in $P_v$ has $v(x) = v(x^{-1}) = 0$ and is thus
a unit in $R_v$. In other words, $R_v$ is a local ring with $P_v$ as its unique maximal
ideal. The ideal $P_v$ is called the valuation ideal of $v$ in $F$. We write $k_v$ for the
field $R_v/P_v$; it is called the residue class field of $v$.


Proposition 6.2. Let $v$ be a discrete valuation of a field $F$, let $R_v$ be the
valuation ring, and let $P_v$ be the valuation ideal. Then

   (a) $R_v$ is a principal ideal domain,
   (b) there exists an element $\pi$ in $P_v$ with $v(\pi) = 1$, and any such $\pi$ has
       $P_v = (\pi)$,
   (c) the nonzero ideals of $R_v$ are exactly the nonnegative integer powers of $P_v$
       and are given by $P_v^n = (\pi^n) = \{x \in R_v \mid v(x) \ge n\}$ for $n \ge 0$,
   (d) the nonzero fractional ideals of $R_v$ are exactly the integer powers of $P_v$
       and are given by $P_v^n = (\pi^n) = \{x \in F \mid v(x) \ge n\}$ for $n \in \mathbb{Z}$.


REMARKS. When $F$ equals $\mathbb{Q}$ and $v$ counts the net power of a prime number
$p$ dividing a rational number, we see by inspection that the ring $R_v$ is the localization
of $\mathbb{Z}$ at $p$, consisting of all rational numbers with no factor of $p$ in their
denominators. The choices² for $\pi$ in (b) are the elements $rp$, where $r$ is any
nonzero rational whose numerator and denominator are both prime to $p$, and the
nonzero ideals are of the form $(p^n)$ with $n \ge 0$.


PROOF. The ideal $P_v$ contains an element $\pi$ with $v(\pi) = 1$ because $v(\cdot)$ is
assumed to be onto $\mathbb{Z} \cup \{+\infty\}$. Suppose that $x$ is a nonzero member of $P_v$ and
that $v(x) = n > 0$. Then $v(\pi^{-n}x) = 0$, and the elements $\pi^{-n}x$ and $x^{-1}\pi^n$ lie in
$R_v$. Hence $x = \pi^n(\pi^{-n}x)$ exhibits $x$ as a member of $(\pi^n)$, and $\pi^n = x(x^{-1}\pi^n)$
exhibits $\pi^n$ as a member of $(x)$. Consequently $(x) = (\pi^n)$. If $I$ is a nonzero
proper ideal in $R_v$, then it follows that $I = \pi^{n_0}R_v$, where $n_0$ is the smallest integer
such that some element $x_0$ of $I$ has $v(x_0) = n_0$. This proves (a), (b), and (c).

Since $R_v$ is a principal ideal domain, it is a Dedekind domain, and the theory of
fractional ideals is applicable. Since (c) shows the nonzero ideals to be all $P_v^n$ with
$n \ge 0$, it follows that the fractional ideals are all $P_v^n$ with $n$ an arbitrary integer.
For any integer $n > 0$, we have $(\pi^{-n})P_v^n = \pi^{-n}R_v\pi^nR_v = R_v = P_v^{-n}P_v^n$, and
thus $P_v^{-n} = (\pi^{-n})$. The latter ideal equals $\pi^{-n}R_v = \{x \in F \mid v(x) \ge -n\}$, and
this proves (d).


From property (vi) it follows for $n > 0$ that the members $x$ of the set $1 + P_v^n$ all
have $v(x) = 0$. The product of two such elements is again in the set because $P_v^n$
is an ideal. Let us see that the multiplicative inverse $x^{-1}$ of a member $x$ of the set
is in the set. We calculate that $v(x^{-1} - 1) = v(x^{-1}) + v(1 - x) = 0 + v(1 - x) =
v(1 - x) \ge n$. Hence $x^{-1}$ is in $1 + P_v^n$, and $1 + P_v^n$ is a group under multiplication.
It is a subgroup of the group $R_v^\times$ of units in $R_v$.


EXAMPLE. When $F = \mathbb{Q}$ and $v$ counts the net power of a prime number
$p$ dividing a rational number, the residue class field $k_v$ has $p$ elements, with
the integers $0, 1, \dots, p - 1$ being coset representatives. The group $R_v^\times$ is the
multiplicative group of rationals having numerators and denominators prime to
$p$. The members of $1 + P_v^n$ are rationals of the form $1 + p^n a b^{-1}$, where $a$ and $b$
are integers and $b$ is prime to $p$. If we write this as $b^{-1}(b + p^n a)$, we see that the
condition on a rational to be in $1 + P_v^n$ is that its numerator and denominator be
prime to $p$ and be congruent to each other modulo $p^n$.


Now we return to our first example of a discrete valuation, which was constructed
from a nonzero prime ideal $P$ in a Dedekind domain $R$. We called the
valuation $v_P(\cdot)$. We asserted earlier that the construction via $v_P(\cdot)$ picks out
the localization of $R$ at $P$ and the associated data. This assertion will be proved
in Proposition 6.4 below. We begin with a handy lemma.


²Some books use the term “uniformizer” or “uniformizing element” for any generator $\pi$ of the
principal ideal $P_v$. The generators are exactly the prime elements of the ring $R_v$.




Lemma 6.3. Let $R$ be a Dedekind domain regarded as a subring of its field of
fractions $F$, let $P$ be a nonzero prime ideal in $R$, and let $v_P$ be the valuation of $F$
defined by $P$. Then any element $x$ of $F$ with $v_P(x) = 0$ is of the form $x = ab^{-1}$
with $a$ and $b$ in $R$ and $v_P(a) = v_P(b) = 0$.


PROOF. If $x$ is an element of $F$ with $v_P(x) = 0$, write $x = a'b'^{-1}$ with $a' \in R$
and $b' \in R$. Then $v_P(a') = v_P(b') = n$ for some integer $n \ge 0$. Since $a'$ and $b'$
are in $R$, $(a')$ and $(b')$ are ordinary ideals, and their prime factorizations are into
ordinary ideals. Let the factorizations be $(a') = P^nQ_1$ and $(b') = P^nQ_2$, where
$Q_1$ and $Q_2$ are products of prime ideals not involving $P$. Since we are dealing
with ordinary ideals, $a'$ and $b'$ lie in $P^n$. Choose an element $z$ in the fractional
ideal $P^{-n}$ that is not in $P^{-n+1}$. By definition of $P^{-n}$, $zP^n$ is contained in $R$.
Hence $za'$ and $zb'$ lie in $R$. Write $(za') = P^mQ_3$ and $(zb') = P^{m'}Q_4$, where
$m \ge 0$ and $m' \ge 0$ and where $Q_3$ and $Q_4$ are ordinary ideals whose prime factorizations do
not involve $P$. Substituting for $(a')$, we obtain $(z)P^nQ_1 = P^mQ_3$ and hence
$(z)P^n = P^mQ_3Q_1^{-1}$. From this expression we see that $Q_3Q_1^{-1}$ is an ordinary
ideal. By definition of $P^{-n+1}$, $(z)P^{n-1}$ is not contained in $R$. Since $(z)P^{n-1} =
P^{m-1}Q_3Q_1^{-1}$, it follows that $m = 0$. Similarly $m' = 0$. Consequently $v_P(za') =
v_P(zb') = 0$, and the lemma follows with $a = za'$ and $b = zb'$.


Proposition 6.4. Let $R$ be a Dedekind domain regarded as a subring of its
field of fractions $F$, let $P$ be a nonzero prime ideal in $R$, and let $v_P(\cdot)$ be
the corresponding valuation of $F$. If $S$ denotes the multiplicative system in $R$
consisting of the complement of $P$ and if the localization $S^{-1}R$ is regarded as a
subring of $F$, then the valuation ring $R_{v_P}$ coincides with $S^{-1}R$ and the valuation
ideal $P_{v_P}$ coincides with $S^{-1}P$.


PROOF. The set $S$ consists exactly of the members $x$ of $R$ with $v_P(x) \le 0$.
Since $v_P$ is nonnegative on $R$, these are the members $x$ of $R$ with $v_P(x) = 0$.
Thus each $x$ in $S^{-1}R$ has $v_P(x) \ge 0$, and $S^{-1}R$ is a subset of $R_{v_P}$.

For the reverse inclusion, fix a member $\pi$ of $P$ that is not in $P^2$. This element
has $v_P(\pi) = 1$. If $x$ is given in $R_{v_P}$ with $v_P(x) = n \ge 0$, then we can write
$x = \pi^n u$ for some member $u$ of $F$ with $v_P(u) = 0$. By Lemma 6.3 we can
decompose $u$ as $u = ab^{-1}$ with $a$ and $b$ in $R$ and $v_P(a) = v_P(b) = 0$. The
members of $R$ on which $v_P$ takes the value 0 are exactly the members of $S$. Thus
$u$ is exhibited as the quotient of two members of $S$, and $u$ is in $S^{-1}R$. Since $\pi$ is
in the ideal $P$ of $R$, $x = \pi^n u$ is in $S^{-1}R$. Hence $R_{v_P} = S^{-1}R$.

The ideal $S^{-1}P$ is a maximal ideal of $S^{-1}R = R_{v_P}$, and we observed just
before Proposition 6.2 that $P_{v_P}$ is the unique maximal ideal of $R_{v_P}$. Therefore
$S^{-1}P = P_{v_P}$.


Let us investigate the nature of an arbitrary discrete valuation in various settings 
involving a Dedekind domain. The main general result of this section is as follows. 




Theorem 6.5. Let $R$ be a Dedekind domain regarded as a subring of its field
of fractions $F$, and let $v$ be a discrete valuation of $F$ such that $R \subseteq R_v$. Then

   (a) $P = R \cap P_v$ is a nonzero prime ideal of $R$,
   (b) the associated discrete valuation $v_P$ defined by $P$ coincides with $v$,
   (c) $PR_v = P_v$,
   (d) $R + P_v = R_v$, and in fact $R + P_v^n = R_v$ for every integer $n \ge 1$, and
   (e) the inclusion of $R$ into $R_v$ induces a field isomorphism $R/P \cong R_v/P_v$.


PROOF. Since 1 is not in $P_v$, the ideal $P$ in (a) is proper. If $a$ and $b$ are members
of $R$ such that $ab$ is in $P$, then $ab$ is in $P_v$, one of $a$ and $b$ is in $P_v$ as well as $R$,
and $P = R \cap P_v$ is a prime ideal. The ideal $P$ cannot be 0 because otherwise
every nonzero element $x$ of $R$ would have $v(x) = 0$; since $F$ is the field of fractions
of $R$, $v$ would then be 0 on every nonzero element of $F$, in contradiction to the fact
that $v$ is onto $\mathbb{Z} \cup \{+\infty\}$. Thus $P$ is a nonzero prime ideal of $R$. This
proves (a).

For (b) and (c), let us begin by showing that $v_P(x) = 0$ implies $v(x) = 0$. By
Lemma 6.3 we can write $x = ab^{-1}$ with $a$ and $b$ in $R$ and with $v_P(a) = v_P(b) = 0$.
The values of $v_P$ show that the members $a$ and $b$ of $R$ are not in $P$. Since
$P = R \cap P_v$, neither $a$ nor $b$ is in $P_v$. Therefore $v(a) \le 0$ and $v(b) \le 0$.
Since $R \subseteq R_v$ by assumption, $v(a) \ge 0$ and $v(b) \ge 0$. We conclude that
$v(a) = v(b) = 0$ and that $v(x) = v(ab^{-1}) = v(a) - v(b) = 0$.

Now we can show that $v = v_P$ and that $PR_v = P_v$. The ideal $PR_v$ of $R_v$ has
to be of the form $P_v^e$ for some integer $e \ge 0$ by Proposition 6.2c, and the integer
$e$ has to be $> 0$ because 1 is not in $PR_v$. If a nonzero $x \in R$ has $v_P(x) = n$ for
some integer $n \ge 0$, then $xR = P^nQ$, where $Q$ is an ideal of $R$ whose prime
factorization does not involve $P$. The function $v_P$ is 0 on $Q$, and the result of the
previous paragraph shows that $v$ is 0 on $Q$. Hence the members of $Q$ are units in
$R_v$, and $QR_v = R_v$. Therefore $xR_v = xRR_v = P^nQR_v = P^nR_v = (PR_v)^n =
P_v^{en}$ and $v(x) = en = e\,v_P(x)$. Since $F$ is the field of fractions of $R$, $v = e\,v_P$
everywhere. The image of $v_P$ is $\mathbb{Z} \cup \{+\infty\}$, and we conclude that $e = 1$. In other
words, $v = v_P$ and $PR_v = P_v$. This proves (b) and (c).

For the first conclusion in (d), we certainly have $R + P_v \subseteq R_v$. In the reverse
direction, let $x \in R_v$ be given. If $v(x) > 0$, then $x$ is in $P_v$, and there is nothing
to prove. If $v(x) = 0$, then (b) and Lemma 6.3 together show that we can write
$x = ab^{-1}$, where $a$ and $b$ are members of $R$ but not $P$. Since $R/P$ is a field, we
can choose $c$ in $R$ with $bc$ in $1 + P$. Then

\[ x - ac = a(b^{-1} - c) = ab^{-1}(1 - bc) = x(1 - bc). \]

The right side is a member of $R_vP$, and (c) showed that $R_vP = P_v$. Therefore $x$
is exhibited as the sum of the member $ac$ of $R$ and the member $x(1 - bc)$ of $P_v$,
and we conclude that $R + P_v = R_v$. This proves the first conclusion in (d).
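The decomposition in this argument is easy to carry out when $R = \mathbb{Z}$ and $P = p\mathbb{Z}$. The Python sketch below is an illustration under that assumption; the names `vp` and `decompose` are our own, and the modular inverse uses the built-in `pow(b, -1, p)`, which assumes Python 3.8 or later.

```python
from fractions import Fraction

def vp(x, p):
    """Net power of the prime p in the rational x, with vp(0) = +infinity."""
    x = Fraction(x)
    if x == 0:
        return float("inf")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def decompose(x, p):
    """Write x = a/b (p not dividing b) as an integer a*c plus a member of P_v."""
    x = Fraction(x)
    a, b = x.numerator, x.denominator
    if b % p == 0:
        raise ValueError("x must lie in the valuation ring")
    c = pow(b, -1, p)          # b*c is congruent to 1 modulo p, so b*c is in 1 + P
    integer_part = a * c       # the member ac of R
    remainder = x - integer_part
    return integer_part, remainder, c

x, p = Fraction(7, 4), 3
integer_part, remainder, c = decompose(x, p)
# The remainder equals x(1 - bc) and has positive valuation.
print(integer_part, remainder, remainder == x * (1 - x.denominator * c), vp(remainder, p))
```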




For the second conclusion in (d), we show inductively for $n \ge 1$ that $P^{n-1} + P_v^n
= P_v^{n-1}$, the case $n = 1$ being what has already been proved in (d). Assume that
case $n$ has been proved. Multiplying the equality by $P$ and using (c), we obtain
$P^n + PP_v^n = (PR_v)P_v^{n-1} = P_vP_v^{n-1} = P_v^n$. Since $P \subseteq P_v$, the term $PP_v^n$ is
contained in $P_v^{n+1}$, but increasing the left side in this way does not increase the
right side. Thus $P^n + P_v^{n+1} = P_v^n$. This completes the induction. Using a second
induction, we show that $R + P_v^n = R_v$. We have already proved this equality for
$n = 1$. If we assume it for $n$ and substitute from what has just been proved, we
obtain $R + (P^n + P_v^{n+1}) = R_v$, and this proves case $n + 1$ since $P^n \subseteq R$. The
second conclusion of (d) thus follows by induction.

For (e), we are assuming that $R \subseteq R_v$, and we have defined $P = R \cap P_v$. Thus
the inclusion $R \to R_v$, when followed by the passage to the quotient $R_v/P_v$,
descends to the quotient as a field map $R/P \to R_v/P_v$. By (d), any member $x$ of
$R_v$ is the sum of a member $y$ of $R$ and a member $z$ of $P_v$; then $y + P$ is the member
of $R/P$ that maps to $x + P_v$ in $R_v/P_v$. Thus the field map $R/P \to R_v/P_v$ is
onto, and (e) is proved.


Corollary 6.6. Let $R$ be a Dedekind domain regarded as a subring of its field
of fractions $F$. If $x$ is a member of $F$ such that $v(x) \ge 0$ for every discrete
valuation $v$ of $F$ satisfying $R \subseteq R_v$, then $x$ lies in $R$.


PROOF. We may assume that $x \ne 0$. Write $x = ab^{-1}$ with $a$ and $b$ in
$R$. Theorem 6.5 shows that the valuations in question are the ones determined
by the nonzero prime ideals of $R$. If the principal ideals $(a)$ and $(b)$ factor as
$(a) = P_1^{j_1} \cdots P_r^{j_r}$ and $(b) = P_1^{k_1} \cdots P_r^{k_r}$, then $0 \le v_{P_i}(x) = v_{P_i}(ab^{-1}) = j_i - k_i$
for $1 \le i \le r$. Thus $j_i \ge k_i$ for all $i$, and the fractional ideal $(ab^{-1})$ equals the
product $P_1^{j_1 - k_1} \cdots P_r^{j_r - k_r}$, which is contained in $R$. Hence $x = ab^{-1}$ lies in $R$.
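For $R = \mathbb{Z}$ the corollary says that a rational number with nonnegative valuation at every prime is an integer. The Python sketch below illustrates this special case only; the function name is our own, and trial division is used simply because it needs no outside library.

```python
from fractions import Fraction

def primes_with_negative_valuation(x):
    """List the primes p at which the nonzero rational x has v_p(x) < 0.

    These are exactly the primes dividing the denominator of x in lowest terms.
    """
    den = Fraction(x).denominator
    primes, q = [], 2
    while den > 1:
        if den % q == 0:
            primes.append(q)
            while den % q == 0:
                den //= q
        q += 1
    return primes

for x in (Fraction(7, 12), Fraction(45, 9), Fraction(-3, 25)):
    bad = primes_with_negative_valuation(x)
    # x is an integer exactly when no prime has negative valuation.
    print(x, bad, x.denominator == 1)
```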


A finite field has no discrete valuations because of the requirement that the
image of a discrete valuation be $\mathbb{Z} \cup \{+\infty\}$. If we drop this requirement in the
definition and let $a$ be a multiplicative generator of a finite field, then any discrete
valuation $v$ would have $v(a^k) = kv(a)$ by property (iii). Taking $k$ equal to the
order of $a$ and using that $v(1) = 0$, we obtain $v(a) = 0$. Thus if we drop
the requirement about the image of a discrete valuation, the only possibility has
$v(0) = +\infty$ and $v(x) = 0$ for all $x \ne 0$. Thus this setting is not very interesting.

The settings in which discrete valuations v are of most interest to us are the 
following: 


   (i) number fields,
   (ii) “function fields in one variable” over a base field,³
   (iii) fields obtained from (i) or (ii) by a process of completion similar to that
         used in forming the field of p-adic numbers.


³This notion has not been defined thus far in the book but will be treated in Chapter VII. The
fields in question are finite algebraic extensions of a field $k(X)$, where $X$ is an indeterminate and $k$


The first of these are the initial subject matter of algebraic number theory, and the
second of these are the initial subject matter of algebraic geometry, namely the geometry
of curves. The third of these are used as a tool in studying the other two. Section
VIII.7 of Basic Algebra explained parts of the analogy between the first two kinds
of fields, and that is why we treat them together. We shall use Proposition 6.7
below to determine their discrete valuations. In the case of (ii), the members of
the base field $k$ are regarded as constants, and the interest is only in valuations
that are 0 on $k^\times$.


Proposition 6.7. Let $R$ be a Dedekind domain, let $F$ be its field of fractions,
let $K$ be a finite algebraic extension of $F$, and let $T$ be the integral closure of $R$
in $K$. If a discrete valuation $v$ of $K$ is $\ge 0$ on $R$, then it is $\ge 0$ on $T$.


REMARKS. We make repeated use in this chapter of the fact that $T$ is a Dedekind
domain in this situation. This fact was proved as Theorem 8.54 of Basic Algebra
for the case that $K$ is a finite separable extension of $F$, but it is valid without
the hypothesis of separability. The result without the hypothesis of separability
will be proved in Chapter VII as part of an investigation of separable and “purely
inseparable” extensions.


PROOF. If $x \ne 0$ is in $T$, then the minimal polynomial of $x$ over $F$ is a monic
polynomial in $R[X]$, and thus there exist an integer $n$ and coefficients $a_{n-1}, \dots, a_0$
in $R$ such that

\[ x^n = a_{n-1}x^{n-1} + \cdots + a_1x + a_0. \]

Properties (ii) and (iii) of discrete valuations show from this equation that

\[ nv(x) \ge \min_{0 \le j \le n-1} (v(a_j) + jv(x)). \]

Since $v(a_j) \ge 0$, we obtain $nv(x) \ge \min_{0 \le j \le n-1} jv(x)$, and it follows that
$v(x) \ge 0$; indeed, if $v(x)$ were negative, the minimum on the right would be
$(n-1)v(x)$, which is greater than $nv(x)$. Thus $v$ is nonnegative on $T$.


Corollary 6.8. The only discrete valuations of the field $\mathbb{Q}$ of rationals are the
ones leading to the p-adic absolute value for each prime number $p$. If $K$ is a
number field and $T$ is its ring of algebraic integers, then the only discrete
valuations of $K$ are the valuations $v_P$ corresponding to each nonzero prime ideal
$P$ of $T$.


is a field called the base field. At times later in the chapter, we shall be interested only in the case 
that the algebraic extension is separable. It will be proved in Chapter VII that for perfect fields k, 
this separability can always be arranged by adjusting the indeterminate X suitably. 




PROOF. If $v$ is an arbitrary discrete valuation of $\mathbb{Q}$, then property (iv) of discrete
valuations shows that $v(-1) = v(1) = 0$, and property (ii) allows us to conclude
that $v$ is nonnegative on all of $\mathbb{Z}$. Thus $\mathbb{Z}$ is contained in the valuation ring of $v$,
and Theorem 6.5 applies. By (a) in the theorem, the intersection of $\mathbb{Z}$ with the
valuation ideal is a nonzero prime ideal of $\mathbb{Z}$, hence is $p\mathbb{Z}$ for some prime number
$p$. Part (b) in the theorem then identifies $v$ as the valuation corresponding to $p\mathbb{Z}$.
This proves the first conclusion.

For the second conclusion, let $v$ be a discrete valuation of $K$. The restriction
to $\mathbb{Q}$ has to be a positive integral multiple of a discrete valuation of $\mathbb{Q}$ or else a
function that is identically 0 on $\mathbb{Q}^\times$. In either case, $v$ is $\ge 0$ on $\mathbb{Z}$, and Proposition
6.7 shows that $v$ is $\ge 0$ on $T$. If $R_v$ denotes the valuation ring of $v$ and $P_v$ denotes
the valuation ideal, then this says that $T \subseteq R_v$. We can therefore apply Theorem
6.5. If $P$ is defined by $P = T \cap P_v$, then (a) in the theorem shows that $P$ is a
nonzero prime ideal, and (b) shows that $v = v_P$.


Let us now consider the field $\mathbb{C}(X)$, regarding it as having some properties in
common with the number field $\mathbb{Q}$. We want to know whether some analog of
Corollary 6.8 is valid for $\mathbb{C}(X)$. The ring $\mathbb{C}[X]$ of polynomials is a principal ideal
domain with $\mathbb{C}(X)$ as field of fractions, and the nonzero prime ideals of $\mathbb{C}[X]$ are all of
the form $(X - c)$ with $c \in \mathbb{C}$ because $\mathbb{C}$ is algebraically closed. For each such
$c$, we therefore obtain a discrete valuation $v_{(X-c)}$. Are there any other discrete
valuations? If we think geometrically about this question, we can regard $\mathbb{C}(X)$
as the rational functions on the Riemann sphere, and each discrete valuation
addresses the order of vanishing of rational functions at some point of the sphere.
For the points of the sphere that correspond to points $c$ of $\mathbb{C}$, such a valuation
picks out the power of $(X - c)$ by which the rational function should be divided
in order to be regular and nonvanishing at $c$. The point $\infty$ on the Riemann sphere
behaves differently. The usual technique in complex-variable theory is to replace
$X$ by $1/X$ and examine the behavior at 0. Following that prescription, we are led
to a discrete valuation $v_\infty$ that is not of the form $v_P$ for some prime ideal $P$ of
$\mathbb{C}[X]$. The definition of $v_\infty$ on the quotient $f(X)/g(X)$ of nonzero polynomials
is

\[ v_\infty(f(X)/g(X)) = \deg g - \deg f \]

with $v_\infty(0) = +\infty$ as usual. The next proposition, which extends one of
Liouville’s theorems in complex-variable theory⁴ from $\mathbb{C}$ to a general field $k$,
says that there are no other discrete valuations of interest for this example.


⁴For a meromorphic function on the Riemann sphere, the sum of the orders of the poles equals
the sum of the orders of the zeros.


Proposition 6.9. Let $k$ be any field, and let $F = k(X)$ be the field of rational
expressions in one indeterminate over $k$. Regard $F$ as the field of fractions of
the principal ideal domain $k[X]$. Then the only discrete valuations of $F$ that
are 0 on the multiplicative group $k^\times$ of nonzero constant polynomials are the
various valuations $v_{(p)}$, where $p(X)$ is a monic prime polynomial in $k[X]$, and
the valuation $v_\infty$ that is defined on nonzero elements of $F$ by

\[ v_\infty(f(X)/g(X)) = \deg g - \deg f \]

if $f$ and $g$ are polynomials. Moreover, any nonzero $h(X)$ in $F$ has

\[ v_\infty(h) + \sum_{p(X)\ \text{monic prime in}\ k[X]} (\deg p)\, v_{(p)}(h) = 0. \]


PROOF. Let $v$ be a discrete valuation of $F$ that is 0 on $k^\times$. First suppose that
$v(X) \ge 0$. Being 0 on the coefficients, $v$ is nonnegative on all polynomials. Thus
$k[X]$ is contained in the valuation ring of $v$, and Theorem 6.5 applies. By (a) in
the theorem, the intersection of $k[X]$ with the valuation ideal is a nonzero prime
ideal of $k[X]$, hence is $(p(X))$ for some monic prime polynomial $p(X)$. Part (b)
in the theorem then identifies $v$ as the valuation corresponding to $(p(X))$.

Next suppose that $v(X) < 0$. Since $k[X^{-1}]$ has $k(X)$ as field of fractions, the
argument in the previous paragraph is applicable, and we find that $v$ is the valuation
determined by the prime ideal $(X^{-1})$ in $k[X^{-1}]$. In particular, $v(X) = -1$. To
find $v(f)$ for a general polynomial $f(X) = a_nX^n + \cdots + a_1X + a_0$ in $k[X]$ under
the assumption that $a_n \ne 0$, we write $f$ as $X^n(a_n + \cdots + a_1X^{1-n} + a_0X^{-n})$.
The member $a_n + \cdots + a_1X^{1-n} + a_0X^{-n}$ of $k[X^{-1}]$ is not divisible by $X^{-1}$, and
thus $v$ is 0 on it. Consequently $v(f) = v(X^n) = nv(X) = -n = -\deg f$. If
$f$ and $g$ are both nonzero in $k[X]$, then it follows that $v(f/g) = v(f) - v(g) =
-\deg f + \deg g = v_\infty(f/g)$. That is, $v = v_\infty$.

To prove the displayed formula, write a given nonzero member $h(X)$ of $F$ as
the quotient of two relatively prime polynomials, thus as $h(X) = f(X)/g(X)$.
Factor the numerator as $f(X) = c\prod_i p_i(X)^{k_i}$ with $c \in k^\times$ and with the $p_i$
distinct monic prime polynomials, and factor the
denominator similarly. If $p(X)$ is a monic prime polynomial, then inspection of
the formula for $f(X)$ shows that $v_{(p)}(f)$ is $k_i$ if $p = p_i$ and is 0 otherwise. Hence
$\sum_p (\deg p)\,v_{(p)}(f) = \sum_i k_i \deg p_i = \deg f$. Subtracting this formula and a
corresponding formula for $g$, we obtain

\[ \sum_p (\deg p)\, v_{(p)}(f/g) = \deg f - \deg g = -v_\infty(h), \]

and the result follows.
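The displayed formula can be verified by machine in a simple case. The Python sketch below is an illustration under a restrictive assumption that is ours and not the proposition's: the rational function is taken over $\mathbb{Q}$ with all of its zeros and poles at rational points, so that every prime polynomial involved is of the form $X - c$ and has degree 1. Polynomials are stored as coefficient lists, lowest degree first, and all helper names are our own.

```python
from fractions import Fraction

def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def divide_by_linear(f, c):
    """Synthetic division of f by (X - c); returns (quotient, remainder)."""
    quotient = [Fraction(0)] * (len(f) - 1)
    carry = Fraction(0)
    for i in range(len(f) - 1, 0, -1):
        carry = f[i] + carry * c
        quotient[i - 1] = carry
    return quotient, f[0] + carry * c

def order_at(f, c):
    """Valuation of the nonzero polynomial f at the prime (X - c)."""
    order = 0
    while True:
        quotient, remainder = divide_by_linear(f, c)
        if remainder != 0:
            return order
        f, order = quotient, order + 1

def v_infinity(f, g):
    """deg g - deg f, assuming both lists have nonzero leading coefficient."""
    return (len(g) - 1) - (len(f) - 1)

def linear(c):
    """Coefficient list of X - c."""
    return [Fraction(-c), Fraction(1)]

# h = f/g with f = (X - 1)^2 (X + 2) and g = X^3 (X - 3).
f = poly_mul(poly_mul(linear(1), linear(1)), linear(-2))
g = poly_mul(poly_mul(poly_mul(linear(0), linear(0)), linear(0)), linear(3))
points = [1, -2, 0, 3]  # all zeros and poles of h in the finite plane
finite_sum = sum(order_at(f, c) - order_at(g, c) for c in points)
total = v_infinity(f, g) + finite_sum
print(v_infinity(f, g), finite_sum, total)
```

For prime polynomials of higher degree, each valuation would have to be weighted by the degree of the prime, as in the statement of the proposition.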


Corollary 6.10. Let $k$ be a field, let $F = k(X)$ be the field of rational
expressions in one indeterminate over $k$, let $K$ be a finite algebraic extension of
$k(X)$, let $T$ be the integral closure of $k[X]$ in $K$, and let $v$ be a discrete valuation
of $K$ that is 0 on the multiplicative group $k^\times$. Then the only possibilities for $v$
are as follows:

   (a) $v(X) \ge 0$, and there exists a unique nonzero prime ideal $P$ in $T$ such that
       $v = v_P$,
   (b) $v(X) < 0$, and there exists a prime ideal $P$ in the integral closure $T'$ of
       $k[X^{-1}]$ in $K$ such that $P \cap k[X^{-1}] = X^{-1}k[X^{-1}]$ and such that $v$ is the
       valuation of $K$ determined by $P$.


REMARK. The ideals $P$ that occur in (b) are the ones in the prime factorization
of the ideal $X^{-1}T'$ in $T'$. There is at least one, and there are only finitely many.


PROOF. The argument is similar to the one for Corollary 6.8, except that
we have to take into account what Proposition 6.9 says when $v(X) < 0$. The
conclusion is that either $v$ is $\ge 0$ on $k[X]$, and then Proposition 6.7 and Theorem
6.5 show that $v$ is as in (a), or else $v(X) < 0$, and then Proposition 6.7 and
Theorem 6.5 show that $v$ is as in (b).


To conclude, let us complete the remarks about fractional ideals begun early
in this section. In the context that $R$ is a Dedekind domain and $F$ is its field of
fractions, we mentioned that the nonzero fractional ideals of $F$ form a group. We
denote this group by $\mathcal{I}$. The nonzero principal fractional ideals form a subgroup
$\mathcal{P}$, and $\mathcal{P}$ is the image of the multiplicative group $F^\times$ under $x \mapsto (x)$; since
$(x) = (y)$ exactly when $xy^{-1}$ is a unit of $R$, $\mathcal{P}$ is isomorphic to $F^\times/R^\times$.

The point of the present discussion is that the group $\mathcal{I}/\mathcal{P}$ is isomorphic to
the ideal class group of $F$ as defined in the number-field setting in Section V.6.
Recall the nature of this group. Two nonzero ideals $I$ and $J$ of $R$ are equivalent
if there exist nonzero members $a$ and $b$ of $R$ with $aI = bJ$. Proposition 5.18
showed in the number-field setting that multiplication of such ideals descends to
a multiplication on the set of equivalence classes and that the result is a group.
This result holds for any Dedekind domain. The group is called the ideal class
group of $F$; we denote it here by $C$.

To verify that $C \cong \mathcal{I}/\mathcal{P}$, we map each nonzero ideal $I$ of $R$ to its coset in $\mathcal{I}/\mathcal{P}$. If $I$ and
$J$ are equivalent ideals of $R$ and $aI = bJ$, then $(ab^{-1})I = J$, and $I$ and $J$ map
to the same coset. Thus $C$ maps homomorphically into $\mathcal{I}/\mathcal{P}$. If $I$ maps into the
identity coset, then $xI = R$ for some $x \in F^\times$. Writing $x$ as $ab^{-1}$ with $a$ and $b$ in
$R$ shows that $aI = bR = (b)$, hence that $I$ is equivalent to a principal ideal. Thus
the homomorphism $C \to \mathcal{I}/\mathcal{P}$ is one-one. Finally if $M$ is any nonzero fractional
ideal of $F$, then we can find some $x \in F^\times$ with $xM \subseteq R$. Here $xM$ is an ideal
of $R$, and the equivalence of $M$ and $xM$ exhibits the class of $M$ in $\mathcal{I}/\mathcal{P}$ as in the
image of $C$. Consequently $C \cong \mathcal{I}/\mathcal{P}$, as asserted.


3. Absolute Values 


The next step in analyzing and generalizing the construction of the p-adic absolute
value is to pass from the valuation, which appears in the exponent, to the absolute
value itself. If $F$ is a field, an absolute value on $F$ is a function $|\cdot|$ from $F$ to
$\mathbb{R}$ such that

   (i) $|x| \ge 0$ with equality if and only if $x = 0$,
   (ii) $|x + y| \le |x| + |y|$ for all $x$ and $y$ in $F$,
   (iii) $|xy| = |x||y|$ for all $x$ and $y$ in $F$.

It follows directly that

   (iv) $|-1| = |1| = 1$ and that
   (v) $|-x| = |x|$ for all $x$ in $F$.

In fact, (iv) follows by combining (i) with (iii) for $x = y = 1$ and then with
(iii) for $x = y = -1$; then (v) follows by combining (iii) and (iv). The absolute
value $|\cdot|$ on $F$ is said to be nonarchimedean if the following strong form of (ii)
holds:⁵

   (ii′) $|x + y| \le \max(|x|, |y|)$ for all $x$ and $y$ in $F$.

Otherwise it is called archimedean. The inequality in (ii′) is called the ultrametric
inequality. When the ultrametric inequality holds, then the following
additional condition holds:

   (vi) $|x + y| = |x|$ whenever $x$ and $y$ in $F$ have $|y| < |x|$.

In fact, when $|y| < |x|$, (ii′) immediately gives $|x + y| \le |x|$. But also (ii′) and
(v) give $|x| \le \max(|x + y|, |-y|) = \max(|x + y|, |y|)$. On the right side, the
maximum cannot be $|y|$ because $|x| \le |y|$ is false. Thus $|x| \le |x + y|$, and (vi)
holds.
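The distinction between the two kinds of absolute value shows up already on a small set of rationals. The Python sketch below is an illustration only, with helper names of our own choosing: it tests the ultrametric inequality (ii′) and condition (vi) for the 2-adic absolute value, computed exactly as a fraction, and tests (ii′) for the ordinary absolute value.

```python
from fractions import Fraction
from itertools import product

def padic_abs(x, p):
    """p-adic absolute value of a rational, returned exactly as a Fraction."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(1, p) ** v

def is_ultrametric(absval, samples):
    """Check |x + y| <= max(|x|, |y|) on all pairs from the sample."""
    return all(absval(x + y) <= max(absval(x), absval(y))
               for x, y in product(samples, repeat=2))

def satisfies_vi(absval, samples):
    """Check |x + y| = |x| whenever |y| < |x| on all pairs from the sample."""
    return all(absval(x + y) == absval(x)
               for x, y in product(samples, repeat=2) if absval(y) < absval(x))

def two_adic(x):
    return padic_abs(x, 2)

samples = [Fraction(n, d) for n in range(-5, 6) for d in (1, 2, 4)]
print(is_ultrametric(two_adic, samples), satisfies_vi(two_adic, samples))
print(is_ultrametric(abs, samples))  # fails already for x = y = 1
```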

Although it might seem counterintuitive, it turns out that the archimedean 
absolute values are easier to understand than the nonarchimedean ones in the 
number fields and function fields of interest to us. 

Because of (iii), any absolute value of F when restricted to F™ is a multiplica- 
tive homomorphism into the positive real numbers. The image in the positive 
reals is therefore a group. 


EXAMPLES OF NONARCHIMEDEAN ABSOLUTE VALUES. 

(1) Let F be any field, and define |x| = 0 for x = Oand |x| = 1 forx #0. The 
result is a nonarchimedean absolute value called the trivial absolute value. It is 
of no interest, and we shall tend to exclude consideration of it from our results. 


⁵Some authors refer to a nonarchimedean absolute value as a “valuation,” using the same term
as for the functions v(·) in Section 2. There is little danger of confusing the two notions, but we
shall use the two distinct names anyway.




Any other absolute value will be said to be nontrivial. Observe for a finite field
F that the fact that x ↦ |x| is a homomorphism from F^× to the positive reals
implies that the only absolute value on a finite field is the trivial one.


(2) Let F be any field, let v be a discrete valuation on F, and fix a real
number r > 1. Then $|x| = r^{-v(x)}$ defines a nonarchimedean absolute value
on F. Property (i) of absolute values follows because v(x) takes values in
$\mathbb{Z} \cup \{+\infty\}$ and is infinite if and only if x = 0, property (ii’) follows
because $v(x + y) \ge \min(v(x), v(y))$, and property (iii) follows because
$v(xy) = v(x) + v(y)$. In particular, the p-adic absolute value is obtained in this way when
we take r = p, and we obtain corresponding examples for any number field F
by taking $v = v_P$ and fixing r > 1, where P is any nonzero prime ideal in the
ring of algebraic integers in F. For the function field F = k(X), we obtain
corresponding examples by taking $v = v_{p(X)}$ and fixing r > 1, where p(X) is any
monic prime polynomial in k[X]. The choice $v = v_\infty$ gives us another example.
In all of these cases, the image of $F^\times$ in $\mathbb{R}^\times$ under the absolute value is discrete
in the sense that each one-point set of the image is open in the relative topology
from the positive reals. Corollary 6.17 will show conversely that any absolute
value for which the image in $\mathbb{R}^\times$ of the nonzero elements is discrete and nontrivial
is obtained in this way from a discrete valuation.

It is worth pausing to interpret some of the conclusions of Theorem 6.5 in terms
of absolute values and metrics.


Proposition 6.11. Let R be a Dedekind domain regarded as a subring of its
field of fractions F, suppose that | · | is an absolute value on F defined by means
of a discrete valuation v, and suppose that the subset $R_v$ of F for which $|x| \le 1$
contains R. If $P_v$ denotes the subset of F with $|x| < 1$, then $P = R \cap P_v$ is a
nonzero prime ideal of R, and also


(a) R is dense in $R_v$,
(b) $P^n$ is dense in $P_v^n$ for every $n \ge 1$,
(c) $R/P \cong R_v/P_v$.


PROOF. In terms of v, the set $R_v$ is the valuation ring, and the set $P_v$ is the
valuation ideal. The hypothesis $R \subseteq R_v$ is the hypothesis of Theorem 6.5. Part (a)
of that theorem shows that $P = R \cap P_v$ is a prime ideal in R. Conclusions (a) and
(b) here follow from Theorem 6.5d. In fact, let $|x| = r^{-v(x)}$ with r > 1. Suppose
that x is given in $P_v^n$ with $n \ge 0$ and that a positive number $r^{-N}$ is specified.
We may assume that $N \ge n$. The condition for x to be in $P_v^n$ is that $|x| \le r^{-n}$.
Theorem 6.5d shows that we can find an $x_0$ in R such that $x_0 + y = x$ with y in
$P_v^N$, hence with $|y| \le r^{-N}$. Then $x_0$ is in R and has $|x_0 - x| = |y| \le r^{-N}$. Hence
$x_0$ is within $r^{-N}$ of x. Since $|x_0| \le \max(|x|, |y|) \le \max(r^{-n}, r^{-N}) = r^{-n}$, $x_0$ is
in $R \cap P_v^n = P^n$. Conclusion (c) is immediate from Theorem 6.5e.
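The density statement in (a) can be made concrete for R = Z inside F = Q with the p-adic absolute value. The following is a minimal sketch, not part of the text's development: the helper names `abs_p` and `approximate_by_integer` are hypothetical, the normalization r = p is assumed, and the modular inverse relies on Python 3.8 or later.

```python
from fractions import Fraction

def abs_p(x, p):
    """p-adic absolute value of a rational x, normalized so that |p|_p = 1/p."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    value = Fraction(1)
    num, den = x.numerator, x.denominator
    while num % p == 0:   # each factor p in the numerator divides the size by p
        num //= p
        value /= p
    while den % p == 0:   # each factor p in the denominator multiplies it by p
        den //= p
        value *= p
    return value

def approximate_by_integer(x, p, N):
    """Return an integer x0 with |x0 - x|_p <= p^(-N), assuming |x|_p <= 1 and N >= 1."""
    x = Fraction(x)
    if abs_p(x, p) > 1:
        raise ValueError("x must lie in the valuation ring")
    modulus = p ** N
    # The denominator of x is prime to p, hence invertible modulo p^N.
    return (x.numerator * pow(x.denominator, -1, modulus)) % modulus

x = Fraction(1, 3)                       # lies in the valuation ring for p = 5
x0 = approximate_by_integer(x, 5, 4)     # an ordinary integer close to 1/3 5-adically
distance = abs_p(x0 - x, 5)
```

Here `x0` plays the role of the element of R found in the proof, and `distance` is at most 5^(-4).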




EXAMPLES OF ARCHIMEDEAN ABSOLUTE VALUES. If F is any subfield of R 
or C and if | - | is defined as the restriction to F of the ordinary absolute value 
function, then | - | is an archimedean absolute value. Remarkably it turns out that 
there are no other archimedean absolute values, apart from “equivalent” ones in 
the sense to be defined below. We return to this matter at the end of Section 4. 
Actually, we shall be interested in archimedean absolute values only when F is 
a number field or is all of R or all of C, and we will not need to invoke any deep 
theorem for the cases of interest to us. 


Properties (i), (ii), and (v) of absolute values show that the function d with
d(x, y) = |x − y| is a metric on F, and the next section will examine what happens
when this metric is completed. The resulting fields will be generalizations of the
field of p-adic numbers and will be useful as tools in investigating number fields
and function fields in one variable.

Two absolute values $|\cdot|_1$ and $|\cdot|_2$ on the same field are said to be equivalent
if there is a positive number $\alpha$ such that $|\cdot|_1 = (|\cdot|_2)^\alpha$. In our passage from
a discrete valuation v to a nonarchimedean absolute value $|\cdot|$, we fixed r > 1
and defined $|x| = r^{-v(x)}$. Changing r changes the absolute value to an equivalent
absolute value. In the archimedean case a positive power of an absolute value
need not be an absolute value, since the triangle inequality may fail. For example
the ordinary absolute value on $\mathbb{R}$ satisfies the triangle inequality; so does its
$\alpha$-th power for $\alpha \le 1$ but not for $\alpha > 1$.
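The failure of the triangle inequality for powers greater than 1 can be checked numerically. The sketch below uses a hypothetical helper `triangle_holds`; the check for the square-root power is only a spot check on a few sample points, not a proof.

```python
def triangle_holds(alpha, x, y):
    """Check |x + y|^alpha <= |x|^alpha + |y|^alpha for the ordinary absolute value."""
    return abs(x + y) ** alpha <= abs(x) ** alpha + abs(y) ** alpha

# alpha = 2 fails already at x = y = 1, since |1 + 1|^2 = 4 > 1 + 1.
fails_for_square = not triangle_holds(2, 1, 1)

# alpha = 1/2 passes on a small grid of sample points.
samples = [-3, -1, -0.5, 0, 0.25, 1, 2, 7]
holds_for_sqrt = all(triangle_holds(0.5, x, y) for x in samples for y in samples)
```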

Equivalent absolute values yield the same topology on F and in fact the same
Cauchy sequences.⁶ Conversely two absolute values that yield the same topology
are equivalent, according to the following proposition.


Proposition 6.12. Two nontrivial absolute values $|\cdot|_1$ and $|\cdot|_2$ on a field F
are equivalent if and only if

$$\{x \in F \mid |x|_1 > 1\} \subseteq \{x \in F \mid |x|_2 > 1\},$$

if and only if they induce the same topology on F.


REMARKS. If $|\cdot|_1$ is the trivial absolute value, then the stated inclusion holds
for all $|\cdot|_2$, but the equivalence may fail; that is why the statement has to exclude
this case. The statement of the proposition remains true if the inequalities $|x|_1 > 1$
and $|x|_2 > 1$ are replaced by $|x|_1 < 1$ and $|x|_2 < 1$, as we see by replacing x by
$x^{-1}$.

PROOF. If the two absolute values are equivalent, then it is immediate from
the definition of equivalent that equality holds in the stated inclusion. Conversely


⁶In many books an equivalence class of absolute values on a field is called a “place” of the field.
We shall use this term in Sections 9 and 10 of this chapter.




suppose that the inclusion holds. Fix $x \in F$ with $|x|_1 > 1$. Such an x exists
because $|\cdot|_1$ is nontrivial. Since $|x|_2 > 1$, there exists a real s > 0 with
$|x|_1 = |x|_2^s$. We shall show that $|\cdot|_1 = |\cdot|_2^s$.

Let $y \in F$ be arbitrary with $|y|_1 \ge 1$. Find the number $r \ge 0$ depending on
y such that $|y|_1 = |x|_1^r$. Let $\{a_n/b_n\}$ be a sequence of positive rationals strictly
decreasing to r such that $a_n$ and $b_n$ are both positive. Then
$|y|_1 = |x|_1^r < |x|_1^{a_n/b_n}$, from which we obtain $|y^{b_n}|_1 < |x^{a_n}|_1$ and
$|x^{a_n} y^{-b_n}|_1 > 1$. By assumption, $|x^{a_n} y^{-b_n}|_2 > 1$, and therefore
$|y|_2 < |x|_2^{a_n/b_n}$. Passing to the limit, we obtain

$$|y|_2 \le |x|_2^r.$$

Now suppose that $|y|_1 > 1$. Arguing similarly with a sequence of positive
rationals strictly increasing to r, we obtain $|y|_2 \ge |x|_2^r$. Thus $|y|_2 = |x|_2^r$. Then
we have

$$|y|_1 = |x|_1^r = |x|_2^{rs} = |y|_2^s \quad \text{whenever } |y|_1 > 1. \qquad (*)$$

If instead $|y|_1 = 1$, then the number r in the second paragraph of the proof
is 0, and we obtain $|y|_2 \le |x|_2^0 = 1$. Replacing y by $y^{-1}$ shows also that $|y|_2 \ge 1$.
Thus $|y|_1 = 1$ implies $|y|_2 = 1$.

The remaining case is that $|y|_1 < 1$. Then we apply (*) to $y^{-1}$ and conclude
that $|y|_1 = |y|_2^s$ in this case as well. This completes the proof of the first conclusion
of the proposition.

For the final statement we know that equivalent absolute values lead to the
same topology. Conversely suppose that the absolute values are not equivalent.
By what we have just shown, there exists $x \in F$ with $|x|_1 > 1$ and $|x|_2 \le 1$. Then
$\{x^{-n}\}$ is a sequence convergent to 0 in the topology from $|\cdot|_1$ but not convergent
to 0 in the topology from $|\cdot|_2$. Therefore the topologies are different.




Proposition 6.13. If | · | is an absolute value on the field F, then the topology
on F induced by the associated metric makes F into a topological field.


REMARK. The proof is similar to part of the argument that proves Proposition 
6.1 except that the general triangle inequality has to be used in place of the 
ultrametric inequality. 


PROOF. To see that addition, subtraction, and multiplication are continuous on
F, let $\{x_n\}$ and $\{y_n\}$ be convergent sequences in F with respective limits x and y.
Use of the triangle inequality on F gives

$$|(x_n + y_n) - (x + y)| = |(x_n - x) + (y_n - y)| \le |x_n - x| + |y_n - y|.$$

The right side has limit 0 in $\mathbb{R}$, and therefore $x_n + y_n$ has limit x + y in F. A
completely analogous argument, making use also of the equality |−1| = |1|,
shows that subtraction is continuous. Consider multiplication. If M is an upper
bound for the absolute values $|x_n|$, then use of the multiplicative property of the
absolute value on F gives

$$\begin{aligned}
|x_n y_n - xy| &= |x_n(y_n - y) + y(x_n - x)| \le |x_n(y_n - y)| + |y(x_n - x)| \\
&= |x_n||y_n - y| + |y||x_n - x| \le M|y_n - y| + |y||x_n - x|.
\end{aligned}$$

The right side has limit 0 in $\mathbb{R}$, and therefore $x_n y_n$ has limit xy in F.

To see that inversion $x \mapsto x^{-1}$ is continuous on $F^\times$, let $\{x_n\}$ be a sequence in
$F^\times$ with limit x in $F^\times$. Since $\lim_n |x_n| = |x|$, we can find an integer N such that
$|x_n| \ge \tfrac{1}{2}|x|$ for $n \ge N$. The computation

$$|x_n^{-1} - x^{-1}| = |(x - x_n)/(x_n x)| = |x - x_n|/(|x_n||x|) \le 2|x|^{-2}|x - x_n|,$$

valid for $n \ge N$, then shows that $\lim x_n^{-1} = x^{-1}$, and inversion is continuous.


Consequently F is a topological field. 


We now give a few results that limit the kinds of absolute values that can arise 
in particular situations. 


Proposition 6.14. If | · | is an absolute value on the field F for which there
is some c with |n| ≤ c for all integers n ∈ Z, i.e., for all additive multiples of 1,
then | · | is nonarchimedean. In particular, | · | is necessarily nonarchimedean if
F has characteristic different from 0.


REMARK. When c exists, then c can be taken to be 1, since the inequality
$|n|^k = |n^k| \le c$ for every integer $k \ge 1$ forces $|n| \le 1$.


PROOF. If x and y are in F and if n is any positive integer, then the Binomial
Theorem gives $(x + y)^n = \sum_{j=0}^{n} \binom{n}{j} x^{n-j} y^j$. Therefore

$$\begin{aligned}
|x + y|^n &\le \sum_{j=0}^{n} \left|\tbinom{n}{j}\right| |x|^{n-j} |y|^j \\
&\le c \sum_{j=0}^{n} \max(|x|, |y|)^{n-j} \max(|x|, |y|)^j \\
&= c(n + 1) \max(|x|, |y|)^n.
\end{aligned}$$

Extraction of the n-th root gives $|x + y| \le c^{1/n}(n + 1)^{1/n} \max(|x|, |y|)$. Passing
to the limit, we obtain $|x + y| \le \max(|x|, |y|)$.
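The last step rests on the fact that the factor $c^{1/n}(n+1)^{1/n}$ tends to 1. A quick numerical illustration follows; the value c = 10 and the helper name `root_factor` are arbitrary choices made only for this sketch.

```python
def root_factor(c, n):
    """The factor c^(1/n) * (n + 1)^(1/n) that multiplies max(|x|, |y|) in the proof."""
    return (c * (n + 1)) ** (1.0 / n)

# The factor decreases toward 1 as n grows, whatever the fixed bound c is.
factors = [root_factor(10.0, n) for n in (1, 10, 100, 10_000, 1_000_000)]
```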




Theorem 6.15 (Ostrowski’s Theorem). If | · | is a nontrivial absolute value
on the field $\mathbb{Q}$, then | · | is equivalent either to the p-adic absolute value $|\cdot|_p$ for
some prime number p or to the ordinary absolute value $|\cdot|_{\mathbb{R}}$.


REMARKS. No two of these are equivalent because $\{p^n\}$ tends to 0 relative to
the p-adic absolute value, $\{p^{-n}\}$ tends to 0 relative to the ordinary absolute value,
and $p^n$ has absolute value 1 relative to the $\ell$-adic absolute value for all prime
numbers $\ell \ne p$.
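The three behaviors named in the remarks can be observed directly on Q. This is a small sketch with a hypothetical helper `abs_p`, normalized with r = p; the primes 5 and 7 are arbitrary sample choices.

```python
from fractions import Fraction

def abs_p(x, p):
    """p-adic absolute value of a rational x, normalized so that |p|_p = 1/p."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    value = Fraction(1)
    num, den = x.numerator, x.denominator
    while num % p == 0:
        num //= p
        value /= p
    while den % p == 0:
        den //= p
        value *= p
    return value

p, ell = 5, 7
powers = [Fraction(p) ** n for n in range(1, 6)]      # p, p^2, ..., p^5
p_adic_sizes = [abs_p(x, p) for x in powers]          # shrink toward 0
ordinary_sizes = [abs(1 / x) for x in powers]         # sizes of p^(-n), shrink toward 0
ell_adic_sizes = [abs_p(x, ell) for x in powers]      # stay equal to 1
```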


PROOF. First suppose that every integer n has $|n| \le 1$. Proposition 6.14 shows
that | · | is nonarchimedean. Since | · | is nontrivial, we must have $|n| < 1$ for
some nonzero integer n, and we may take n to be positive. Since |n| is the product of |p| over
all primes dividing n, multiplicities included, some prime number p has $|p| < 1$.
Let us see that p is unique. If, on the contrary, $|q| < 1$ for a second prime number
q, choose integers a and b with ap + bq = 1. Then
$1 = |1| = |ap + bq| \le \max(|ap|, |bq|) = \max(|a||p|, |b||q|) \le \max(|p|, |q|) < 1$,
contradiction. If we now define a positive real $\alpha$ by $|p| = p^{-\alpha}$, then it follows
that $|n| = (|n|_p)^\alpha$ for all integers n. Therefore $|\cdot| = (|\cdot|_p)^\alpha$ on all of $\mathbb{Q}$.

Now suppose that n is some integer with $|n| > 1$. We may assume that n is
positive. For any positive integer m, the triangle inequality gives

$$|m| = |1 + \cdots + 1| \le |1| + \cdots + |1| = m.$$

In particular we have $|n| = n^\alpha$ for some real $\alpha$ with $0 < \alpha \le 1$.
We shall prove that

$$|m| \le m^\alpha \qquad (*)$$

for all positive integers m. We start by expanding m to the base n, writing

$$m = c_0 + c_1 n + c_2 n^2 + \cdots + c_{k-1} n^{k-1},$$

where k is the integer such that $n^{k-1} \le m < n^k$ and where each $c_j$ satisfies
$0 \le c_j < n$. The triangle inequality gives

$$\begin{aligned}
|m| &\le |c_0| + |c_1||n| + |c_2||n|^2 + \cdots + |c_{k-1}||n|^{k-1} \\
&\le (n-1)(1 + n^\alpha + n^{2\alpha} + \cdots + n^{(k-1)\alpha}) && \text{by definition of } \alpha \\
&< \frac{(n-1)n^{k\alpha}}{n^\alpha - 1} = \frac{(n-1)n^\alpha\, n^{(k-1)\alpha}}{n^\alpha - 1} \\
&\le \frac{(n-1)n^\alpha}{n^\alpha - 1}\, m^\alpha && \text{since } n^{k-1} \le m.
\end{aligned}$$

In other words, there is a positive number C independent of m such that
$|m| \le Cm^\alpha$ for every positive integer m. For every positive integer N, we then have
$|m|^N = |m^N| \le Cm^{\alpha N}$, and thus $|m| \le C^{1/N}m^\alpha$. Letting N tend to infinity, we
obtain (*).
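Let us now improve (*) to the equality

$$|m| = m^\alpha \quad \text{for every positive integer } m. \qquad (**)$$

The integer k above has $n^{k-1} \le m < n^k$. Put $d = n^k - m$; this satisfies
$0 < d \le n^k - n^{k-1}$. Then

$$n^{k\alpha} = |n|^k = |n^k| \le |m| + |d| \le |m| + d^\alpha \le |m| + (n^k - n^{k-1})^\alpha,$$

and consequently

$$|m| \ge n^{k\alpha} - (n^k - n^{k-1})^\alpha = n^{k\alpha}\bigl(1 - (1 - n^{-1})^\alpha\bigr) > m^\alpha\bigl(1 - (1 - n^{-1})^\alpha\bigr).$$

Thus $|m| \ge C'm^\alpha$ for some positive constant $C'$ independent of m. For every
positive integer N, we then have $|m|^N = |m^N| \ge C'm^{\alpha N}$ and hence
$|m| \ge C'^{1/N}m^\alpha$. Letting N tend to infinity, we obtain $|m| \ge m^\alpha$. In combination with
(*), this proves (**).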

Since $|-m| = |m|$, the equality (**) implies $|m| = (|m|_{\mathbb{R}})^\alpha$ for every integer
m. Taking quotients, we obtain $|q| = (|q|_{\mathbb{R}})^\alpha$ for every rational q.


Corollary 6.16. If | · | is a nontrivial absolute value on a number field F, then
the restriction of | · | to Q is nontrivial.


REMARK. In view of Ostrowski’s Theorem (Theorem 6.15), the restriction to 
Q therefore has to be equivalent to the p-adic absolute value for some p or to the 
ordinary absolute value. 


PROOF. Since | · | is nontrivial, there exists x with $|x| > 1$. Raising x to
a power if necessary, we may assume that $|x| > 2$. Arguing by contradiction,
suppose that $|q| = 1$ for all nonzero q in $\mathbb{Q}$. Since x is algebraic over $\mathbb{Q}$, there
exist an integer $n \ge 1$ and rational coefficients $q_{n-1}, \dots, q_0$ such that

$$x^n = q_{n-1}x^{n-1} + \cdots + q_1 x + q_0.$$

Applying | · | to both sides and using that $|q_j| \le 1$ for all j gives

$$|x|^n \le |x|^{n-1} + \cdots + |x| + 1 = \frac{|x|^n - 1}{|x| - 1} < |x|^n - 1,$$

the right-hand inequality holding because $|x| > 2$. We have thus obtained
$|x|^n < |x|^n - 1$ and have arrived at a contradiction.




An absolute value | · | on a field F such that the image of $F^\times$ is discrete is
called a discrete absolute value. The p-adic absolute values on $\mathbb{Q}$ and on $\mathbb{Q}_p$
furnish examples.


Corollary 6.17. If | · | is a nontrivial discrete absolute value on the field F,
then | · | is nonarchimedean, and $|x| = r^{-v(x)}$ for some real r > 1 and some
discrete valuation v of F.


REMARKS. Example 2 of nonarchimedean absolute values shows that discrete
valuations always lead to discrete absolute values. This corollary is a converse.
The trivial absolute value is of course nonarchimedean, but it does not arise from
a discrete valuation. We shall not be interested in any nonarchimedean absolute
values that do not arise from discrete valuations.


PROOF. First we show that | · | is nonarchimedean. Proposition 6.14 immediately
handles the case that F has nonzero characteristic, and we may therefore
take the characteristic to be 0. Let D be the discrete image subgroup of $F^\times$. This
D in particular must contain the image of $\mathbb{Q}^\times$. Meanwhile, Theorem 6.15 says
that the restriction of | · | to $\mathbb{Q}$ has to be trivial, or equivalent to the p-adic absolute
value for some p, or equivalent to the ordinary absolute value. Under the ordinary
absolute value, the image of $\mathbb{Q}^\times$ cannot be contained in D, and the restriction
must be one of the other kinds. For all of the other kinds, the image of $\mathbb{Z}$ is
bounded, and Proposition 6.14 allows us to conclude that | · | is nonarchimedean.

Now that | · | is nonarchimedean, we write $D = r^{\mathbb{Z}}$ with r > 1, which is possible
because D is a nontrivial discrete subgroup of the positive reals, and we set
$v(0) = +\infty$ and $v(x) = -\log_r |x|$ for $x \ne 0$. Properties (i), (ii’), and (iii) of
nonarchimedean absolute values immediately imply the three defining properties
of a discrete valuation.


Corollary 6.18. If | · | is a nontrivial discrete absolute value on a field F, then
the corresponding valuation ring $R = \{x \in F \mid |x| \le 1\}$ and the valuation ideal
$P = \{x \in F \mid |x| < 1\}$ are open and closed in F.


REMARK. Corollary 6.17 shows that | - | is defined by a discrete valuation. 


PROOF. The definitions of R and P in the statement show that R is closed
and P is open. Let D be the image of $F^\times$ under | · |. A discrete subgroup
of positive reals has to be equal⁷ to {1} or to the subgroup $r^{\mathbb{Z}}$ for a unique real
r > 1. The nontriviality of | · | implies that the correct alternative is $r^{\mathbb{Z}}$. Then
the equality $R = \{x \in F \mid |x| < r\}$ shows that R is open, and the equality
$P = \{x \in F \mid |x| \le r^{-1}\}$ shows that P is closed.


Next we prove a general result applicable to number fields and to function 
fields in one variable that yields the conclusion that nonarchimedean absolute 
values in these cases are automatically discrete. The general result is obtained in 
two parts, stated as Lemma 6.19 and Proposition 6.20. 


⁷One can invoke Lemma 5.14, for example.




Lemma 6.19. If R is a Dedekind domain regarded as a subring of its field of
fractions F, and if | · | is a nonarchimedean absolute value on F that is ≤ 1 on
R, then | · | is discrete. Hence either | · | is trivial or else it is defined by the
valuation relative to a nonzero prime ideal of R.


PROOF. The subset of $x \in R$ for which $|x| < 1$ is a proper ideal I in R, and
we let P be a prime ideal containing I. Since R is a Dedekind domain, P defines
a corresponding discrete valuation $v_P$. Let $|x|_P = 2^{-v_P(x)}$. Then

$$\{x \in R \mid |x| < 1\} = I \subseteq P = \{x \in R \mid |x|_P < 1\},$$

and hence

$$\{x \in R \mid |x|_P = 1\} \subseteq \{x \in R \mid |x| = 1\}. \qquad (*)$$

Let $\pi$ be an element of R with $|\pi|_P = \tfrac{1}{2}$. If x is an arbitrary nonzero member of
F with $|x|_P \le 1$, then Proposition 6.4 shows that we can write $x = \pi^k x'$ with
$k \ge 0$, $x'$ in R, and $|x'|_P = 1$. Then $|x'| = 1$ by (*), and it follows that $|x| = |\pi|^k$.
Since $|x|_P = |\pi|_P^k$ also, there are only two possibilities. One possibility is that
$|x| = |\pi| = 1$ for all $x \ne 0$, and then | · | is trivial. The other possibility is that
the subsets of F for which $|x| < 1$ and for which $|x|_P < 1$ coincide. In this case
we apply Proposition 6.12 and conclude that | · | and $|\cdot|_P$ are equivalent.


Proposition 6.20. Let R be a Dedekind domain regarded as a subring of its
field of fractions F, let K be a finite algebraic extension of F, and let T be the
integral closure of R in K. If | · | is a nonarchimedean absolute value on K that
is ≤ 1 on R, then it is ≤ 1 on T. Hence | · | is discrete, and either | · | is trivial
or else it is defined by the valuation relative to a nonzero prime ideal of T.


PROOF. As with Proposition 6.7, T is a Dedekind domain. If $x \ne 0$ is in T,
then the minimal polynomial of x over R is a monic polynomial in R[X], and
thus there exist an integer n and coefficients $a_{n-1}, \dots, a_0$ in R such that

$$x^n = a_{n-1}x^{n-1} + \cdots + a_1 x + a_0.$$

Taking the absolute value of both sides and using the nonarchimedean property,
we obtain

$$|x|^n \le \max_{0 \le j \le n-1} (|a_j||x|^j) \le \max_{0 \le j \le n-1} (|x|^j) = \max(1, |x|^{n-1}),$$

the second inequality holding because | · | is assumed to be ≤ 1 on R. If we could have
$|x| > 1$, then this inequality would read $|x|^n \le |x|^{n-1}$, which is a contradiction.
We conclude that $|x| \le 1$ for all $x \in T$. The conclusions in the last sentence of
the proposition now follow from Lemma 6.19.




Corollary 6.21. If K is a number field, then every nontrivial nonarchimedean
absolute value | · | on K comes from the valuation $v_P$ relative to some nonzero
prime ideal P in the ring of algebraic integers in K.


REMARK. Proposition 6.27 below will classify the archimedean absolute values 
on a number field. 


PROOF. Since | · | is nonarchimedean, its restriction to Q is nonarchimedean.
By Ostrowski’s Theorem (or by inspection), it is ≤ 1 on Z. The result now
follows from Proposition 6.20 if we take R to be Z and F to be Q.


Corollary 6.22. Let k be a field, let F = k(X) be the field of rational
expressions in one indeterminate over k, let K be a finite algebraic extension of
F, let T be the integral closure of k[X] in K, and let | · | be a nontrivial
nonarchimedean absolute value on K that is 1 on the multiplicative group $k^\times$.
Then | · | is discrete, and the only possibilities for it are as follows:

(a) $|X| \le 1$, and there exists a unique nonzero prime ideal P in T such that
| · | comes from the valuation determined by P,

(b) $|X| > 1$, and there exists a prime ideal P in the integral closure $T'$ of
$k[X^{-1}]$ in K such that $P \cap k[X^{-1}] = X^{-1}k[X^{-1}]$ and such that | · |
comes from the valuation of K determined by P.


REMARKS. As with Proposition 6.7, T and $T'$ are Dedekind domains. If
k has nonzero characteristic, then Proposition 6.14 shows that every absolute
value is nonarchimedean. For the case that k has characteristic zero, remarks at
the end of Section 4 will indicate why every absolute value that is 1 on $k^\times$ is
nonarchimedean; we shall not need to make use of this fact, however. In any
event, just as with Corollary 6.10, the ideals P that occur in (b) are the ones in
the prime factorization of the ideal $X^{-1}T'$ in $T'$; there is at least one, and there
are only finitely many.


PROOF. The argument is similar to the one for Corollary 6.21, except that we
have to take into account what happens when $|X| > 1$. We apply Proposition
6.20 either with R = k[X] or with $R = k[X^{-1}]$.

Since | · | is 1 on $k^\times$, an inequality $|X| \le 1$ implies that | · | is ≤ 1 on k[X],
| · | being assumed to be nonarchimedean. Then Proposition 6.20 and Corollary
6.10 show that (a) holds. Similarly an inequality $|X| > 1$ implies that | · | is ≤ 1
on $k[X^{-1}]$ because | · | is assumed nonarchimedean. Then Proposition 6.20 and
Corollary 6.10 show that (b) holds.


Theorem 6.23 (Weak Approximation Theorem). Let $|\cdot|_1, \dots, |\cdot|_n$ be
inequivalent nontrivial absolute values on a field F. If $\epsilon > 0$ is a real number
and $x_1, \dots, x_n$ are elements of F, then there exists y in F such that

$$|y - x_j|_j < \epsilon \quad \text{for } 1 \le j \le n.$$




REMARKS. The special case of this theorem in which F is a number field and
the absolute values are defined by n distinct nonzero prime ideals in the ring of
algebraic integers follows from the Chinese Remainder Theorem (Theorem 8.27
of Basic Algebra, restated in the present book on page xxv). In fact, it is enough
to handle the case that all the $x_j$’s are algebraic integers in F. Let the prime ideals
be $P_1, \dots, P_n$, and let $|\cdot|_j = r_j^{-v_{P_j}(\cdot)}$ with $r_j > 1$. If we specify any positive
integers $k_1, \dots, k_n$, then the Chinese Remainder Theorem produces an algebraic
integer y in F such that $y \equiv x_j \bmod P_j^{k_j}$ for $1 \le j \le n$. These congruences say
that $v_{P_j}(y - x_j) \ge k_j$, hence that $|y - x_j|_j \le r_j^{-k_j}$. Thus we have only to choose
$k_1, \dots, k_n$ large enough to make $r_j^{-k_j} < \epsilon$ for all j, and the inequalities of the
theorem will hold.
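For F = Q the recipe in these remarks is ordinary modular arithmetic. The following sketch assumes the normalization $r_j = p_j$; the helper names `crt` and `abs_p`, the primes 2, 3, 5, the targets, and the exponents are all illustrative choices, and the modular inverse needs Python 3.8 or later.

```python
from fractions import Fraction

def crt(residues, moduli):
    """Solve y = residues[j] (mod moduli[j]) for pairwise coprime moduli."""
    y, m = 0, 1
    for r, n in zip(residues, moduli):
        # Adjust y by a multiple of m so that the new congruence holds as well.
        t = ((r - y) * pow(m, -1, n)) % n
        y += m * t
        m *= n
    return y % m

def abs_p(n, p):
    """|n|_p for an integer n, normalized so that |p|_p = 1/p."""
    if n == 0:
        return Fraction(0)
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return Fraction(1, p ** v)

primes = [2, 3, 5]
targets = [1, 2, 3]            # the integers x_1, x_2, x_3 to be approximated
exponents = [10, 7, 5]         # chosen so that p^(-k) < 1/1000 for each prime
epsilon = Fraction(1, 1000)

y = crt(targets, [p ** k for p, k in zip(primes, exponents)])
errors = [abs_p(y - x, p) for x, p in zip(targets, primes)]
```

Each entry of `errors` is at most $p_j^{-k_j}$, which is below `epsilon` by the choice of exponents.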
PROOF. First let us prove that we can find an element z in F with

$$|z|_1 > 1 \quad \text{and} \quad |z|_j < 1 \ \text{ for } 2 \le j \le n. \qquad (*)$$

We do so by induction on n, the case n = 2 being Proposition 6.12. Assuming
the result for n − 1, find u with $|u|_1 > 1$ and $|u|_j < 1$ for $2 \le j \le n-1$. Then by
the result for n = 2, find v with $|v|_1 > 1$ and $|v|_n < 1$. Let k > 0 be an integer
to be specified, and put

$$z = \begin{cases} u & \text{if } |u|_n < 1, \\ u^k v & \text{if } |u|_n = 1, \\ \dfrac{u^k}{1 + u^k}\, v & \text{if } |u|_n > 1. \end{cases}$$

In the second case, k is to be chosen large enough to make $|u|_j^k |v|_j < 1$ for
$2 \le j \le n-1$. In the third case, k is to be chosen large enough to make
$|u|_1^k (1 + |u|_1^k)^{-1} |v|_1 > 1$, $|u|_j^k (1 - |u|_j^k)^{-1} |v|_j < 1$ for $2 \le j \le n-1$, and
$|u|_n^k (|u|_n^k - 1)^{-1} |v|_n < 1$. Then z satisfies the conditions in (*), and the inductive
proof of (*) is complete.

Applying (*), find $z_i$ such that $|z_i|_i > 1$ and $|z_i|_j < 1$ for $i \ne j$. Let l be a
positive integer to be specified, and put

$$y = \sum_{i=1}^{n} \frac{x_i z_i^l}{1 + z_i^l}.$$

Since $y - x_j = -x_j(1 + z_j^l)^{-1} + \sum_{i \ne j} x_i z_i^l (1 + z_i^l)^{-1}$, we obtain

$$|y - x_j|_j \le |x_j|_j (|z_j|_j^l - 1)^{-1} + \sum_{i \ne j} |x_i|_j\, |z_i|_j^l (1 - |z_i|_j^l)^{-1}. \qquad (**)$$

For l large enough, the coefficients $(|z_j|_j^l - 1)^{-1}$ and $|z_i|_j^l (1 - |z_i|_j^l)^{-1}$ for $i \ne j$
can be made as small as we please, and thus the right side of (**) can be made to
be $< \epsilon$.
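The construction of y can be carried out exactly on Q for the 2-adic, 3-adic, and ordinary absolute values. In the sketch below the elements `z`, the targets `xs`, and the exponent `l = 40` are hand-picked assumptions for illustration; they are not produced by the inductive argument, only checked against its requirements.

```python
from fractions import Fraction

def abs_p(x, p):
    """p-adic absolute value of a rational x, normalized so that |p|_p = 1/p."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    value = Fraction(1)
    num, den = x.numerator, x.denominator
    while num % p == 0:
        num //= p
        value /= p
    while den % p == 0:
        den //= p
        value *= p
    return value

# Three inequivalent absolute values on Q: 2-adic, 3-adic, ordinary.
abs_values = [lambda x: abs_p(x, 2), lambda x: abs_p(x, 3), lambda x: abs(Fraction(x))]

# z[i] is large for the i-th absolute value and small for the other two.
z = [Fraction(3, 4), Fraction(2, 9), Fraction(6)]

def approximate(xs, l):
    """The element y = sum_i x_i z_i^l / (1 + z_i^l) from the proof."""
    return sum(x * zi ** l / (1 + zi ** l) for x, zi in zip(xs, z))

xs = [Fraction(1), Fraction(2), Fraction(5)]
y = approximate(xs, 40)
errors = [abs_values[j](y - xs[j]) for j in range(3)]
```

With these choices each entry of `errors` is far below 1/1000; the slowest term is the ordinary size of $(3/4)^{40}$, which is about $10^{-5}$.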


4. Completions 


In this section we finish our project of establishing an abstract theory that
generalizes the construction of the field of p-adic numbers. A little care is appropriate
in stating the results. Here is an example of the cost of imprecision: We know
that the field $\mathbb{Q}_p$ is obtained by completing $\mathbb{Q}$ with respect to the p-adic absolute
value. We shall see in Section 5 that $\mathbb{Q}_p$ for p = 5 is obtained also by completing
the field $\mathbb{Q}(i)$ with respect to a certain absolute value and that in fact there are
two distinct equivalence classes of absolute values on $\mathbb{Q}(i)$ for which $\mathbb{Q}_5$ results
in this way. Thus a completion process is not well specified unless we include all
the data: the original field, the absolute value on it (or at least the equivalence
class of absolute values), and the mapping into the completed space.
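One way to see why two different mappings of $\mathbb{Q}(i)$ into $\mathbb{Q}_5$ are plausible is that −1 has two square roots modulo every power of 5. The sketch below is only an illustration that anticipates Section 5; the function name `sqrt_minus_one_mod` and the use of Newton's method are assumptions of this example, and the modular inverse needs Python 3.8 or later.

```python
def sqrt_minus_one_mod(p, k, start):
    """Lift a root of x^2 + 1 modulo p to a root modulo p^k by Newton's method."""
    x, modulus = start % p, p
    if (x * x + 1) % p != 0:
        raise ValueError("start must be a square root of -1 modulo p")
    for _ in range(k - 1):
        modulus *= p
        # Newton step x - f(x)/f'(x); f'(x) = 2x is invertible modulo p.
        x = (x - (x * x + 1) * pow(2 * x, -1, modulus)) % modulus
    return x

# The two roots 2 and 3 of x^2 + 1 modulo 5 lift to two roots modulo 5^6.
a = sqrt_minus_one_mod(5, 6, 2)
b = sqrt_minus_one_mod(5, 6, 3)
```

Sending i to the 5-adic limit of the lifts of 2, or to the limit of the lifts of 3, gives two different field maps, which is consistent with the two equivalence classes mentioned above.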

For this reason we introduce the notions of a valued field, namely a pair
$(F, |\cdot|_F)$ consisting of a field and an absolute value on it, and a homomorphism
of valued fields. If $(F, |\cdot|_F)$ and $(K, |\cdot|_K)$ are the two valued fields in question,
a homomorphism from the first to the second is a field map $\varphi : F \to K$ such
that $|x|_F = |\varphi(x)|_K$ for all x in F. We write $\varphi^*$ for the corresponding operation
of restriction: $\varphi^*(|\cdot|_K) = |\cdot|_F$. If $\varphi$ carries F onto K, then $\varphi$ is called an
isomorphism of valued fields.

A completion of a valued field $(F, |\cdot|_F)$ is defined to be a homomorphism
of valued fields $\varphi : (F, |\cdot|_F) \to (K, |\cdot|_K)$ such that $(K, |\cdot|_K)$ is complete as
a metric space and $\varphi(F)$ is dense in K. The first theorem establishes existence.


Theorem 6.24. Let F be a field with a nontrivial absolute value $|\cdot|_F$, let
d be the associated metric on F, let $\mathcal{R}$ be the subring of $\prod_{j=1}^{\infty} F$ consisting of
all Cauchy sequences relative to d, and let $\mathcal{I}$ be the ideal in $\mathcal{R}$ consisting of all
sequences convergent to 0. Then $\mathcal{I}$ is a maximal ideal in $\mathcal{R}$, and the quotient $\mathcal{R}/\mathcal{I}$
is a field. Consequently the Cauchy completion of F relative to d is a topological
field $\overline{F} = \mathcal{R}/\mathcal{I}$. Let $\iota : F \to \overline{F}$ be the natural map $F \to \mathcal{R} \to \mathcal{R}/\mathcal{I}$ of F into the
Cauchy completion given by carrying members of F into constant sequences in $\mathcal{R}$,
followed by passage to the quotient. The metric $\overline{d}$ on the Cauchy completion is the
unique continuous function $\overline{d} : \overline{F} \times \overline{F} \to \mathbb{R}$ such that $\overline{d}(\iota(x), \iota(y)) = d(x, y)$. If
a real-valued function $|\cdot|_{\overline{F}}$ is defined on $\overline{F}$ by $|x|_{\overline{F}} = \overline{d}(x, 0)$ for $x \in \overline{F}$, then $|\cdot|_{\overline{F}}$
is an absolute value on $\overline{F}$, and $\iota : (F, |\cdot|_F) \to (\overline{F}, |\cdot|_{\overline{F}})$ is a homomorphism
of valued fields. Moreover, the absolute value on $\overline{F}$ is nonarchimedean if the
absolute value on F is nonarchimedean.


REMARKS. The usual construction of the Cauchy completion embeds the
original metric subspace as a dense subset of a complete metric space, and
therefore this theorem is showing that $\iota : (F, |\cdot|_F) \to (\overline{F}, |\cdot|_{\overline{F}})$ is a completion
of $(F, |\cdot|_F)$.
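To get a feel for Cauchy sequences relative to the metric d, one can look at the partial sums of $1 + p + p^2 + \cdots$ in $\mathbb{Q}$ with the p-adic absolute value. This is a sketch with a hypothetical helper `abs_p`; the limit here happens to be the rational number $1/(1-p)$, so the example shows the metric at work rather than a new element of the completion.

```python
from fractions import Fraction

def abs_p(x, p):
    """p-adic absolute value of a rational x, normalized so that |p|_p = 1/p."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    value = Fraction(1)
    num, den = x.numerator, x.denominator
    while num % p == 0:
        num //= p
        value /= p
    while den % p == 0:
        den //= p
        value *= p
    return value

p = 5
limit = Fraction(1, 1 - p)                                   # equals -1/4
partial_sums = [sum(Fraction(p) ** j for j in range(n)) for n in range(1, 8)]
distances = [abs_p(s - limit, p) for s in partial_sums]      # p^-1, p^-2, ..., p^-7
```

The ordinary sizes of the partial sums grow without bound, while their 5-adic distances to −1/4 shrink geometrically.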




PROOF. The proof of this theorem is almost the same as the first part of the
proof of Proposition 6.1, apart from notational changes. The differences occur in
spots where the ultrametric inequality was invoked in the proof of Proposition 6.1
and only the triangle inequality is available here. The main such difference is the
argument that the validity of the triangle inequality on F implies the validity of the
triangle inequality on $\overline{F}$, and we give that argument in a moment. Correspondingly
it is unnecessary for us to prove that the validity of the ultrametric inequality on
F implies the validity of the ultrametric inequality on $\overline{F}$, because that argument
does occur in the proof of Proposition 6.1.

The other places in the proof of Proposition 6.1 where the ultrametric inequality
was used are in the proof that the completion is a topological field. It is not
necessary to modify that proof here, however, since we can invoke Proposition
6.13.

Thus let us see that the validity of the triangle inequality on F implies the
validity of the triangle inequality on $\overline{F}$. To proceed, let x and y be members of
$\overline{F} = \mathcal{R}/\mathcal{I}$, and let $\{q_n\}$ and $\{r_n\}$ be respective coset representatives of them in $\mathcal{R}$.
Then $\{q_n + r_n\}$ is a representative of x + y, by definition, and the continuity of
$|\cdot|_{\overline{F}}$ on $\overline{F}$ implies that $\lim_n |q_n + r_n|_F = |x + y|_{\overline{F}}$. From this limit formula and
the triangle inequality for F, we obtain

$$\begin{aligned}
|x + y|_{\overline{F}} = \lim_n |q_n + r_n|_F &\le \limsup_n\, (|q_n|_F + |r_n|_F) \\
&\le \limsup_n |q_n|_F + \limsup_n |r_n|_F = |x|_{\overline{F}} + |y|_{\overline{F}},
\end{aligned}$$

since $\lim_n |q_n|_F = |x|_{\overline{F}}$ and $\lim_n |r_n|_F = |y|_{\overline{F}}$. This proves the triangle inequality
on $\overline{F}$.


A valued field $(L, |\cdot|_L)$ is said to be complete if L is Cauchy complete in
the metric defined by $|\cdot|_L$. In Section 6 we shall make crucial use of a universal
mapping property of the completion of a valued field.


Theorem 6.25. If $\iota : (F, |\cdot|_F) \to (K, |\cdot|_K)$ is a completion of the valued
field $(F, |\cdot|_F)$ and if $\varphi : (F, |\cdot|_F) \to (L, |\cdot|_L)$ is a homomorphism of valued
fields with $(L, |\cdot|_L)$ complete, then there exists a unique homomorphism of
valued fields $\Phi : (K, |\cdot|_K) \to (L, |\cdot|_L)$ such that $\varphi = \Phi \circ \iota$.


REMARKS. As usual with universal mapping properties, this theorem implies 
a uniqueness result: any two completions of a valued field are canonically iso- 
morphic. It is not necessary to write out the details. Making a small adjustment 
to the proof below, we see also that if a field has two equivalent absolute values 
on it, then the corresponding two completions are canonically isomorphic by a 
field map that respects the topologies. 




PROOF. The theory of completion of a metric space produces a unique continuous
function $\Phi : K \to L$ such that $\varphi = \Phi \circ \iota$, and this continuous function
respects the metrics. It is necessary to check only that $\Phi$ respects addition and
multiplication.

The argument is the same for the two operations, and we check only addition.
Let x and y be given in K, and choose sequences $\{x_n\}$ and $\{y_n\}$ in F with
$\lim \iota(x_n) = x$, $\lim \iota(y_n) = y$. Since addition is continuous in K,
$\lim \iota(x_n + y_n) = x + y$. Since $\Phi$ is a continuous function with $\varphi = \Phi \circ \iota$,

$$\begin{aligned}
\Phi(x) + \Phi(y) &= \Phi(\lim \iota(x_n)) + \Phi(\lim \iota(y_n)) \\
&= \lim \Phi(\iota(x_n)) + \lim \Phi(\iota(y_n)) = \lim \varphi(x_n) + \lim \varphi(y_n) \\
&= \lim(\varphi(x_n) + \varphi(y_n)) = \lim \varphi(x_n + y_n) \\
&= \lim \Phi(\iota(x_n + y_n)) = \Phi(\lim \iota(x_n + y_n)) = \Phi(x + y),
\end{aligned}$$

and $\Phi$ respects addition.


Theorem 6.24 generalizes the parts of Proposition 6.1 concerning $\mathbb{Q}_p$, but not
those concerning $\mathbb{Z}_p$. The arguments concerning $\mathbb{Z}_p$ transparently made use of
the ultrametric inequality, and they used a little more. The extra fact used is
that the p-adic absolute value is defined from a discrete valuation. In view of
Corollary 6.17 and Example 2 of nonarchimedean absolute values in the previous
section, a necessary and sufficient condition for a nontrivial absolute value on a
field F to be obtained from a discrete valuation is that the image of $F^\times$ under
the absolute value be a discrete subset of the positive reals. Such an absolute value is
automatically nonarchimedean.


Theorem 6.26. Let $\iota : (F, |\cdot|_F) \to (\overline{F}, |\cdot|_{\overline{F}})$ be a completion of a valued
field, and suppose that $|\cdot|_F$ is nontrivial and discrete. Let v(·) be the discrete
valuation that defines $|\cdot|_F$ on F. Then
(a) the image $|\overline{F}^\times|_{\overline{F}}$ equals the image $|F^\times|_F$, and $|\cdot|_{\overline{F}}$ on $\overline{F}$ is therefore
defined by a discrete valuation $\overline{v}(\cdot)$ on $\overline{F}$ such that $\overline{v} \circ \iota = v$,
(b) the image $\iota(R)$ of the valuation ring R of v is dense in the valuation ring
$\overline{R}$ of $\overline{v}$,
(c) for every integer n > 0, the image $\iota(P^n)$ of the n-th power $P^n$ of the
valuation ideal P of v is dense in the n-th power $\overline{P}^n$ of the valuation ideal
$\overline{P}$ of $\overline{v}$,
(d) the residue class fields of F and $\overline{F}$ coincide in the sense that the mapping
$\iota : R \to \overline{R}$ descends to a field isomorphism of R/P onto $\overline{R}/\overline{P}$,
(e) for every integer n > 0, the mapping $\iota : R \to \overline{R}$ descends to a ring
isomorphism of $R/P^n$ onto $\overline{R}/\overline{P}^n$,
(f) $\overline{R}$ is compact if R/P is finite, and in this case the topological field $\overline{F}$ is
locally compact.




REMARK. No assertion is made in (d) and (e) about whether the topologies 
match under the constructed isomorphisms. Our interest will be mostly in the case 
that R/P is finite, in which case the topologies match because they are discrete. 


PROOF. Write $|F^\times|_F$ in the form $r^{\mathbb{Z}}$ for a unique real number r > 1. For
(a), since $|\iota(x)|_{\bar F} = |x|_F$ and since $\iota(F)$ is dense in $\bar F$, the continuity of the
absolute value $|\cdot|_{\bar F}$ implies that the image of $\bar F^\times$ is contained in the closure of
$r^{\mathbb{Z}}$ within the positive reals, which is $r^{\mathbb{Z}}$. The formula $\bar v \circ \iota = v$ follows from the
computation $r^{-v(x)} = |x|_F = |\iota(x)|_{\bar F} = r^{-\bar v(\iota(x))}$ by taking the logarithm to the
base r.

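For (b) and (c), we use that $\iota(F)$ is dense in $\bar F$, and we treat (b) as the case n = 0
of (c). Fix $n \ge 0$ and consider $\bar P^n$. Choose a sequence $\{x_k\}$ in F with $\{\iota(x_k)\}$
converging to a point $\bar x$ in $\bar P^n$. Since $|\bar x|_{\bar F} \le r^{-n}$ and since the nonzero values
of the absolute value lie in the discrete set $r^{\mathbb{Z}}$, we must have $|x_k|_F \le r^{-n}$
for all sufficiently large k. The elements $x_k$ satisfying this condition are in $P^n$,
and thus $\iota(P^n)$ is dense in $\bar P^n$.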

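For (d) and (e), the mapping $R \to \bar R/\bar P^n$ descends to $R/P^n$, since $\iota(P^n) \subseteq \bar P^n$.
The descended map is one-one, since if $x \in R$ maps to the 0 coset, then x is in
$\iota^{-1}(\bar P^n) = P^n$. To see that the descended map is onto, let a coset $\bar x + \bar P^n$ be
given. Since $\iota(R)$ is dense in $\bar R$, we can choose $x \in R$ with $|\iota(x) - \bar x|_{\bar F} < r^{-n+1}$.
Since $\bar P^n = \{\bar y \in \bar F \mid |\bar y|_{\bar F} < r^{-n+1}\}$, $\iota(x) - \bar x$ is in $\bar P^n$. Hence $\iota(x)$ is exhibited
as in $\bar x + \bar P^n$, and the coset $x + P^n$ maps to the coset $\bar x + \bar P^n$.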

In (f), Corollary 8.60 of Basic Algebra shows that $P^n/P^{n+1}$ is a 1-dimensional
vector space over $R/P$. The First Isomorphism Theorem gives an R module
isomorphism $(R/P^{n+1})/(P^n/P^{n+1}) \cong R/P^n$, and it follows by induction on n
that the finiteness of $R/P$ implies the finiteness of $R/P^n$. In view of (e), $\bar R/\bar P^n$
is finite for every n > 0.

For each n > 0, the set $\bar R$ is covered by the cosets of $\bar P^n$, which are closed
balls in $\bar F$ of radius $r^{-n}$ and open balls of radius $r^{-n+1}$. Thus for any positive
radius, there exists a finite collection of open balls of that radius or less such that
the union of the open balls covers $\bar R$. This means that $\bar R$ is totally bounded in the
metric space $\bar F$. A totally bounded closed subset of a complete metric space is
compact, and consequently $\bar R$ is compact.

Thus the 0 element of $\bar F$ has $\bar R$ as a compact neighborhood. Since addition is
continuous, each member x of $\bar F$ has $x + \bar R$ as a compact neighborhood of x, and
therefore $\bar F$ is locally compact.
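
As a numerical illustration of parts (b) and (e), the following Python sketch treats only the special case F = Q with the p-adic absolute value, where the valuation ring R consists of the rationals with denominator prime to p. It shows that each coset of $R/P^n$ contains an ordinary integer between 0 and $p^n - 1$, so that $R/P^n$ has exactly $p^n$ elements. The function name `integer_representative` is ours and is not notation from the text.

```python
from fractions import Fraction

def integer_representative(x, p, n):
    """Return the integer m in [0, p**n) with x - m in P^n.

    Here x is a rational number whose denominator is prime to p,
    i.e., a member of the valuation ring of the p-adic valuation on Q.
    """
    x = Fraction(x)
    if x.denominator % p == 0:
        raise ValueError("x does not lie in the valuation ring")
    modulus = p ** n
    # pow(d, -1, m) is the inverse of d modulo m (Python 3.8 or later).
    return (x.numerator * pow(x.denominator, -1, modulus)) % modulus

# Every coset of R/P^2 for p = 5 is hit by one of the 25 integers 0..24.
reps = {integer_representative(Fraction(a, b), 5, 2)
        for a in range(-30, 31) for b in (1, 2, 3, 7)}
print(len(reps))                                      # number of cosets found
print(integer_representative(Fraction(1, 3), 5, 3))   # an integer close to 1/3
```

Since 3 times the second printed value is congruent to 1 modulo 125, that integer differs from 1/3 by an element of absolute value at most $5^{-3}$.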


Let us review briefly. We start with an absolute value on a field F. The 
cases of initial interest are that F is a number field or is a function field in one 
variable, namely a finite algebraic extension of a field k(X), where k is a given 
base field; in the latter case we assume that the absolute value is identically 1
on $k^\times$. A number field can have archimedean absolute values, and we come


346 VI. Reinterpretation with Adeles and Ideles 


to them in a moment. In the function-field case we know that every absolute 
value is nonarchimedean if k has nonzero characteristic; this remains true for 
characteristic zero but we did not prove it. For our cases of interest the
nonarchimedean nontrivial absolute values are always given by a discrete valuation.

Thus let us summarize what happens for a nonarchimedean nontrivial absolute
value that is given by a discrete valuation. Within the given field F we have
singled out a Dedekind domain R for which F is the field of fractions,⁸ and the
absolute value is $\le 1$ on R. For example, in the number-field case R is the ring
of algebraic integers in F. In all cases the discrete valuation v is determined by a
nonzero prime ideal $\mathfrak{p}$ of R, and the absolute value on F is given by $|x|_F = r^{-v(x)}$
for some number r > 1. Our two-step process consists in a step of localization
and a step of completion. The step of localization passes to the principal ideal
domain $S^{-1}R$ with maximal ideal $S^{-1}\mathfrak{p}$, where S is the complement of $\mathfrak{p}$ in R.
The domain $S^{-1}R$ coincides with the valuation ring of v, and the ideal $S^{-1}\mathfrak{p}$
coincides with the valuation ideal of v. The absolute value on F does not change
during this process of localization. The ideal $S^{-1}\mathfrak{p}$ is principal in $S^{-1}R$, say with
$\pi$ as a generator. The element $\pi$ can be chosen to be in $\mathfrak{p}$, and it has $v(\pi) = 1$.
Theorem 6.5 and Proposition 6.11 govern relationships between R and $S^{-1}R$.
Briefly the powers of $\mathfrak{p}$ are dense in the powers of $S^{-1}\mathfrak{p}$, and the natural map of
residue class fields $R/\mathfrak{p} \to S^{-1}R/S^{-1}\mathfrak{p}$ is a field isomorphism onto.

The second step is a step of completion with respect to the absolute value.
The completion of a valued field $(F, |\cdot|_F)$ is a homomorphism of valued fields
$\iota : (F, |\cdot|_F) \to (L, |\cdot|_L)$ such that $(L, |\cdot|_L)$ is complete as a metric space
and $\iota$ carries F onto a dense subfield of L. This exists by Theorem 6.24. In
the situation with a nonarchimedean nontrivial absolute value that is given by a
discrete valuation, one often writes $F_{\mathfrak{p}}$ for the completed field L. The eventual
interest is partly in what happens to R and $\mathfrak{p}$, but we first consider $S^{-1}R$ and
$S^{-1}\mathfrak{p}$. The completed absolute value $|\cdot|_{F_{\mathfrak{p}}}$ is given by a discrete valuation $\bar v$
with $\bar v \circ \iota = v$. Let us write $R_{\mathfrak{p}}$ for its valuation ring and $\mathfrak{p}_{\mathfrak{p}}$ for its valuation
ideal. Theorem 6.26 governs the relationships between $S^{-1}R$ and $R_{\mathfrak{p}}$. Briefly
the images under $\iota$ of the powers of $S^{-1}\mathfrak{p}$ are dense in the powers of $\mathfrak{p}_{\mathfrak{p}}$, and the
natural map of residue class fields $S^{-1}R/S^{-1}\mathfrak{p} \to R_{\mathfrak{p}}/\mathfrak{p}_{\mathfrak{p}}$ induced by $\iota$ is a field
isomorphism onto.

The case of most interest for number theory is the case of a number field F and
the absolute value determined by a nonzero prime ideal $\mathfrak{p}$ in the ring of algebraic
integers of F. The field $F_{\mathfrak{p}}$ is called the field of $\mathfrak{p}$-adic numbers, and the ring
$R_{\mathfrak{p}}$ is called the ring of $\mathfrak{p}$-adic integers. When $F = \mathbb{Q}$ and $\mathfrak{p} = p\mathbb{Z}$ for a prime
number p, the element $\pi$ can be taken to be p.
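
For the case $F = \mathbb{Q}$ and $\mathfrak{p} = p\mathbb{Z}$ the valuation and the absolute value can be computed directly on rational numbers. The sketch below is only an illustration under the assumption that the constant r is taken to be p itself, which is the customary normalization; the names `vp` and `abs_p` are ours.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation v(x) of a nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is not a finite integer")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:      # powers of p in the numerator count positively
        num //= p
        v += 1
    while den % p == 0:      # powers of p in the denominator count negatively
        den //= p
        v -= 1
    return v

def abs_p(x, p):
    """p-adic absolute value |x| = p**(-v(x)), with |0| = 0 (assumes r = p)."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(p) ** (-vp(x, p))

x, y = Fraction(1, 5), Fraction(4, 5)
print(abs_p(Fraction(50, 3), 5), abs_p(Fraction(3, 25), 5))
# Ultrametric inequality |x + y| <= max(|x|, |y|) in one instance:
print(abs_p(x + y, 5) <= max(abs_p(x, 5), abs_p(y, 5)))
```

With this normalization the element $\pi = p$ has `vp(p, p) == 1`, in agreement with the requirement $v(\pi) = 1$.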


⁸ The case R = F is excluded; this is the case that produces the trivial absolute value, which
does not interest us.




In the case of a function field in one variable that is most analogous to a
number field, one starts from a field F that is a finite algebraic extension of
$\mathbb{F}_q(X)$, where $\mathbb{F}_q$ is a finite field with q elements. According to Corollary 6.22,
all but finitely many of the nonarchimedean absolute values are defined in terms
of nonzero prime ideals in the integral closure of $\mathbb{F}_q[X]$ in F; the others are
the prime constituents of the ideal $X^{-1}\mathbb{F}_q[X^{-1}]$ in $\mathbb{F}_q[X^{-1}]$. One can show that
the ring in the completion analogous to $R_{\mathfrak{p}}$ is always a ring of formal power
series $\mathbb{F}_{q'}[[X]]$ in one indeterminate X and with coefficients in a finite extension
$\mathbb{F}_{q'}$ of $\mathbb{F}_q$. Elements of this ring are arbitrary formal power series of the form
$\sum_{k=0}^{\infty} c_k X^k$ with all $c_k$ in $\mathbb{F}_{q'}$. The field of fractions analogous to $F_{\mathfrak{p}}$ is always a
field of formal Laurent series $\mathbb{F}_{q'}((X))$ in one indeterminate; nonzero elements
of this field are arbitrary expressions of the form $\sum_{k=-N}^{\infty} c_k X^k$ with all $c_k$ in $\mathbb{F}_{q'}$,
with $c_{-N} \ne 0$, and with N depending on the element.


Let us now examine archimedean completions. We shall discuss what happens
when we start from a number field, and then we make some remarks without proof
about the general case. Thus let F be a number field, and let an archimedean
absolute value be given on it. To have notation parallel to the nonarchimedean
case, it is customary to index the absolute value⁹ by a symbol like v, writing $|\cdot|_v$
for it. Corollary 6.16 shows that the restriction of $|\cdot|_v$ to $\mathbb{Q}$ is nontrivial, and the
combination of Proposition 6.14 and Ostrowski’s Theorem (Theorem 6.15) shows
that the restriction to $\mathbb{Q}$ is equivalent to the ordinary absolute value. Adjusting
$|\cdot|_v$ within its equivalence class, we may assume that its restriction to $\mathbb{Q}$ matches
the ordinary absolute value. Using Theorem 6.24, we form the completion of F
with respect to $|\cdot|_v$, writing $F_v$ for the completed space. The limits of Cauchy
sequences from $\mathbb{Q}$ itself show that $\mathbb{R}$ lies in the completed space, since $|\cdot|_v$
matches the ordinary absolute value on $\mathbb{Q}$. Thus we can regard $\mathbb{R}$ as a subfield
of $F_v$, and F is a subfield as well. Consequently the set $\mathbb{R}F$ of sums of products
is a subring of $F_v$. The multiplication mapping of $\mathbb{R} \times F$ into $F_v$ is $\mathbb{Q}$ bilinear
and has a linear extension $\mathbb{R} \otimes_{\mathbb{Q}} F \to F_v$ whose image is $\mathbb{R}F$. The $\mathbb{R}$ dimension
of $\mathbb{R} \otimes_{\mathbb{Q}} F$ is $[F : \mathbb{Q}]$, and consequently the $\mathbb{R}$ dimension of $\mathbb{R}F$ is $\le [F : \mathbb{Q}]$,
hence finite. Being a finite-dimensional $\mathbb{R}$ algebra embedded in a field, $\mathbb{R}F$ is a
subfield¹⁰ of $F_v$. It is therefore a finite algebraic extension of $\mathbb{R}$ and must be $\mathbb{R}$
or $\mathbb{C}$. Thus F lies in $\mathbb{R}$ or $\mathbb{C}$. The fields $\mathbb{R}$ and $\mathbb{C}$ are complete relative to the
ordinary absolute value, and hence $\mathbb{R}F$ is a closed subset of $F_v$. Since F is dense,
we conclude that $F_v$ is $\mathbb{R}$ or $\mathbb{C}$.

Visualize having a standard copy of $\mathbb{C}$ available, with $\mathbb{R}$ embedded in it. From
the above remarks, any archimedean absolute value of the number field F, after


⁹ Or the equivalence class of the absolute value.
¹⁰ Within a field if a nonzero element is algebraic over a base field, then the smallest ring containing
the base field and the element contains also the inverse of the element.




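adjustment within its equivalence class, yields a completion that takes one of the
two forms

$$(F, |\cdot|_v) \to (\mathbb{R}, |\cdot|) \quad\text{and}\quad (F, |\cdot|_v) \to (\mathbb{C}, |\cdot|),$$

where $|\cdot|$ is ordinary absolute value on $\mathbb{R}$ or $\mathbb{C}$. Conversely any field mapping $\sigma$
of F into $\mathbb{R}$ or $\mathbb{C}$ has dense image either in $\mathbb{R}$ or in $\mathbb{C}$ and defines an archimedean
absolute value on F by $|\cdot|_v = \sigma^*(|\cdot|)$. Then $\sigma : (F, |\cdot|_v) \to (\mathbb{R} \text{ or } \mathbb{C}, |\cdot|)$
is a completion by Theorem 6.25.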

To classify the archimedean absolute values up to equivalence, we recall from
Section V.2 that the number of distinct field maps $\sigma$ into $\mathbb{C}$ of a number field
F of degree $[F : \mathbb{Q}] = n$ is exactly n, with a certain number $r_1$ of them having
image in $\mathbb{R}$ and with the remainder $2r_2$ having image in $\mathbb{C}$ but not $\mathbb{R}$ and occurring
in complex conjugate pairs. Each such field map $\sigma$ gives us a completion. The
members of a complex conjugate pair result in the same absolute value on F when
the ordinary absolute value of $\mathbb{C}$ is restricted to F. We shall show that there are
no other equivalences.


Proposition 6.27. Let F be a number field with $[F : \mathbb{Q}] = n$, and let there
be $r_1$ distinct field maps of F into $\mathbb{R}$ and $r_2$ complex conjugate pairs of distinct
field maps of F into $\mathbb{C}$, with $r_1 + 2r_2 = n$. Each such field map $\sigma$ induces an
archimedean absolute value on F by restriction from $\mathbb{R}$ or $\mathbb{C}$, the only equivalences
are the ones from pairs of field maps related by complex conjugation, and the
resulting collection of $r_1 + r_2$ absolute values exhausts the archimedean absolute
values on F, up to equivalence.


PROOF. The remarks above show everything except that these $r_1 + r_2$ absolute
values are mutually inequivalent. To prove this fact, suppose that $\sigma$ and $\sigma'$ are two
field maps of F into the same field, $\mathbb{R}$ or $\mathbb{C}$, such that $x \mapsto |\sigma(x)|$ is equivalent
to $x \mapsto |\sigma'(x)|$. Then $\varphi = \sigma'\sigma^{-1}$ is a field isomorphism from image $\sigma$ onto
image $\sigma'$ that respects the absolute value, up to a power. It is therefore uniformly
continuous from image $\sigma$ onto image $\sigma'$. Consequently $\varphi$ extends to all of $\mathbb{R}$ or $\mathbb{C}$,
and the continuous extension respects the field operations. On $\mathbb{Q}$, $\varphi$ is the identity,
and hence its continuous extension to $\mathbb{R}$ must be the identity. Thus the continuous
extension is an automorphism of $\mathbb{R}$ or $\mathbb{C}$ that fixes $\mathbb{R}$, and consequently it must
be the identity or complex conjugation.
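
To see the count $r_1 + r_2$ in a small case, the sketch below handles only cubic fields $\mathbb{Q}[X]/(X^3 + pX + q)$. It assumes that the cubic is irreducible over $\mathbb{Q}$ (the code does not check this) and uses the standard fact that the sign of the discriminant $-4p^3 - 27q^2$ decides whether the cubic has three real roots or one. The function name is ours.

```python
def archimedean_signature_cubic(p, q):
    """Return (r1, r2) for the field Q[X]/(X^3 + p X + q).

    Assumption: X^3 + p X + q is irreducible over Q (not verified here).
    """
    disc = -4 * p**3 - 27 * q**2
    if disc == 0:
        raise ValueError("repeated root; the cubic cannot be irreducible over Q")
    # disc > 0: three real roots; disc < 0: one real root and one conjugate pair.
    return (3, 0) if disc > 0 else (1, 1)

for p, q in [(0, -2), (-3, 1)]:
    r1, r2 = archimedean_signature_cubic(p, q)
    print((p, q), "r1 =", r1, "r2 =", r2, "archimedean absolute values:", r1 + r2)
```

For $X^3 - 2$ there is one real embedding and one conjugate pair, hence two inequivalent archimedean absolute values; for $X^3 - 3X + 1$ there are three.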


It is of some interest to know what archimedean absolute values can occur in
other situations, besides number fields, and Theorem 6.24 shows that it is enough
to classify the complete ones. Ostrowski did so, and the result is that $\mathbb{R}$ and $\mathbb{C}$,
with their ordinary absolute values, are the only complete archimedean fields up
to equivalence.¹¹


¹¹ A proof of the Ostrowski result may be found in Hasse’s Number Theory, pp. 191-194. Gelfand




5. Hensel’s Lemma 


Hensel’s Lemma is a device that in its simplest forms allows one to solve
polynomial equations in the field $\mathbb{Q}_p$ of p-adic numbers by using congruence information
modulo some power of p. It has a number of distinct formulations, all of which
work within any complete nonarchimedean valued field, not limited to $\mathbb{Q}_p$. We
shall give a fairly simple formulation and obtain a handy special case as a corollary, 
using an adaptation of Newton’s method of iterations in calculus for finding roots 
of polynomials. At the end of the section, we shall state without proof a version of 
Hensel’s Lemma that works to factor polynomials rather than to find their roots. 
Yet another formulation of Hensel’s Lemma, whose precise statement we omit, 
applies to systems of polynomial equations in several variables. 

No overarching result of this chapter actually makes use of any version of 
Hensel’s Lemma. Instead, versions of Hensel’s Lemma are indispensable in 
analyzing the fine structure of complete valued fields and in handling examples. 
Thus the applications of Hensel’s Lemma in this book will occur in the examples of 
this section and the next and also in problems at the end of the chapter. Problem 16 
is one such problem. 


Theorem 6.28 (Hensel’s Lemma). Let F be a field with a nontrivial discrete
absolute value $|\cdot|$, necessarily nonarchimedean, and assume that F is complete.
Let R be the valuation ring, and let f(X) be a polynomial in R[X]. Suppose that
$a_0$ is a member of R such that

$$|f(a_0)| < |f'(a_0)|^2.$$

Then the sequence $\{a_n\}$ recursively given by

$$a_{n+1} = a_n - \frac{f(a_n)}{f'(a_n)}$$

is well defined in R and converges to a root a of f(X) that satisfies $|a - a_0| < 1$.


PROOF. Put $c = |f(a_0)|/|f'(a_0)|^2 < 1$. We prove the following three
statements together by induction on n:

(i) $a_n$ is well defined and is in R,
(ii) $|f'(a_n)| = |f'(a_0)| \ne 0$, and
(iii) $|f(a_n)|/|f'(a_n)|^2 \le c^{2^n}$.


and Tornheim proved a more general result, with the same conclusion, that allows the multiplicative 
property of absolute values to be relaxed somewhat. A proof of this result appears in Artin’s Theory 
of Algebraic Numbers, pp. 45-51. 




The base case for the induction is the case n = 0, and the three statements are 
true by hypothesis in this case. 

Assume that the three statements hold for n. From (ii), $a_{n+1}$ is defined, and
then (iii) shows that $a_{n+1}$ satisfies

(iii') $|a_{n+1} - a_n| = |f(a_n)|/|f'(a_n)| \le c^{2^n}|f'(a_0)|$.

The fact that $a_n$ and $f'(a_0)$ are in R, in combination with (iii'), shows that $a_{n+1}$
is in R. This proves (i) for n + 1.

For (ii) and (iii), we make use of the following Taylor expansions of f(X) and
f'(X) about b:

$$f(X) = f(b) + (X - b) f'(b) + (X - b)^2 g(X) \quad\text{with } g(X) \in R[X]$$

and

$$f'(X) = f'(b) + (X - b) h(X) \quad\text{with } h(X) \in R[X].$$


To check that these expansions are valid in any characteristic, it is enough to
check the first one, since the second one follows by differentiation. For the first
one, it is enough to treat the special case $X^k$. Dividing $X^k - b^k$ by $X - b$, we see
that we are to produce g(X) such that

$$(X - b) g(X) = \sum_{j=0}^{k-1} b^{k-1-j} X^j - k b^{k-1} = \sum_{j=0}^{k-1} b^{k-1-j} (X^j - b^j).$$

Every term on the right side is divisible by $X - b$, and thus the quotient g(X) is
in R[X].


Put $\beta_n = a_{n+1} - a_n = -f(a_n)/f'(a_n)$. By (iii) for n, $|\beta_n| \le |f'(a_n)| c^{2^n}$;
in particular, $|\beta_n| < |f'(a_n)|$. In the expansion of f'(X), we take $b = a_n$ and
evaluate at $X = a_{n+1}$ to obtain

$$f'(a_{n+1}) = f'(a_n) + \beta_n h(a_{n+1}).$$

Since $|\beta_n| < |f'(a_n)|$ and $|h(a_{n+1})| \le 1$, we see that $|f'(a_{n+1})| = |f'(a_n)|$.
This proves (ii) for n + 1.
In the expansion of f(X), we take $b = a_n$ and evaluate at $X = a_{n+1}$ to obtain

$$f(a_{n+1}) = f(a_n) + (a_{n+1} - a_n) f'(a_n) + (a_{n+1} - a_n)^2 g(a_{n+1}).$$

But $(a_{n+1} - a_n) f'(a_n) = -f(a_n)$, and hence this equation simplifies to

$$f(a_{n+1}) = \beta_n^2 g(a_{n+1}).$$




Since $g(a_{n+1})$ is in R, application of (iii) for n and (ii) for n + 1 gives

$$\frac{|f(a_{n+1})|}{|f'(a_{n+1})|^2} = \frac{|\beta_n|^2 |g(a_{n+1})|}{|f'(a_n)|^2} \le \Big(\frac{|f(a_n)|}{|f'(a_n)|^2}\Big)^2 \le c^{2^{n+1}},$$

and this proves (iii) for n + 1. This completes the induction.
Now we can prove the theorem. If n < m, then (iii') and the ultrametric
inequality imply that

$$|a_m - a_n| \le \max_{n \le k < m} |a_{k+1} - a_k| \le |f'(a_0)| \max_{n \le k < m} c^{2^k} \le |f'(a_0)| c^{2^n}. \qquad (*)$$

Consequently $\{a_n\}$ is a Cauchy sequence. Let a be its limit. Substituting
into the definition of $a_{n+1}$, using (ii), and passing to the limit, we obtain
$a = a - f(a)/f'(a)$. Thus f(a) = 0. Taking n = 0 in (*) and letting m tend to
infinity gives $|a - a_0| \le |f'(a_0)| c$, and this is $\le c < 1$ because $f'(a_0)$ is in R.
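
The induction in the proof can be watched numerically. The following Python sketch is an illustration only: it takes $F = \mathbb{Q}_5$, $f(X) = X^2 + 1$, and $a_0 = 2$ (our choice of example), runs the iteration of Theorem 6.28 in exact rational arithmetic, and checks statements (ii) and (iii) in terms of valuations, using that $|x| \le |y|$ means $v(x) \ge v(y)$. The helper names are ours.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a nonzero rational number."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("valuation of 0 is infinite")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def newton_iterates(f, df, a0, steps):
    """Return [a_0, ..., a_steps] with a_{n+1} = a_n - f(a_n)/f'(a_n)."""
    a = Fraction(a0)
    out = [a]
    for _ in range(steps):
        a = a - f(a) / df(a)
        out.append(a)
    return out

p = 5
f = lambda x: x * x + 1
df = lambda x: 2 * x
iterates = newton_iterates(f, df, 2, 4)

# Valuation of c = |f(a_0)| / |f'(a_0)|^2; the hypothesis c < 1 means v_c > 0.
v_c = vp(f(iterates[0]), p) - 2 * vp(df(iterates[0]), p)
for n, a in enumerate(iterates):
    assert vp(df(a), p) == vp(df(iterates[0]), p)                 # statement (ii)
    assert vp(f(a), p) - 2 * vp(df(a), p) >= (2 ** n) * v_c       # statement (iii)
    print(n, a, "v(f(a_n)) =", vp(f(a), p))
```

The printed valuations of $f(a_n)$ double at each step, which is the quadratic convergence expressed by the bound $c^{2^n}$.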


Corollary 6.29 (Hensel’s Lemma). Let F be a field with a nontrivial discrete
absolute value, necessarily nonarchimedean, and assume that F is complete. Let
R be the valuation ring, let $\mathfrak{p}$ be the unique maximal ideal, and let f(X) be a
polynomial in R[X]. If $\bar f(X)$ is the reduced polynomial with coefficients in $R/\mathfrak{p}$
and if $\bar a$ is a simple root of $\bar f(X)$, then f(X) has a simple root $a \in R$ whose
image in $R/\mathfrak{p}$ is $\bar a$.


PROOF. Let $a_0$ be any member of R whose image in $R/\mathfrak{p}$ is $\bar a$. The assumptions
imply that $f(a_0)$ is in $\mathfrak{p}$ and that $f'(a_0)$ is in R but not $\mathfrak{p}$. Thus the hypotheses
of Theorem 6.28 are satisfied, and the theorem produces a root a of f(X) with
$a - a_0$ in $\mathfrak{p}$.


EXAMPLES WITH $F = \mathbb{Q}_p$ AND $R = \mathbb{Z}_p$.


(1) Suppose that p is an odd prime and that n is an integer for which the
Legendre symbol $\left(\frac{n}{p}\right)$ is +1, i.e., for which GCD(n, p) = 1 and n has a square
root modulo p. Then n has a square root in $\mathbb{Z}_p$. This is immediate from Corollary
6.29 with $f(X) = X^2 - n$.

(2) Suppose that p = 2 and that n is an integer¹² having the form 8k + 1. The
maximal ideal in $\mathbb{Z}_2$ is (2). Corollary 6.29 is not applicable to $f(X) = X^2 - n$,
since evaluation of the derivative $f'(X) = 2X$ at any point of $\mathbb{Z}_2$ leads to a
member of the ideal (2). However, we can apply Theorem 6.28. Let $a_0 = 1$,
so that $f(a_0) = 1 - n$ and $f'(a_0) = 2$. The theorem produces a root a in $\mathbb{Z}_2$ if
$|1 - n|_2/|2|_2^2 < 1$, i.e., if $|1 - n|_2 < \frac14$. Since $|1 - n|_2 = |-8k|_2 = \frac18 |k|_2 < \frac14$,
the theorem indeed applies. The resulting root a in $\mathbb{Z}_2$ has $a \equiv 1 \bmod (2)$.
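
A 2-adic square root of an integer $n \equiv 1 \bmod 8$ can be approximated with integer arithmetic alone. The sketch below does not run the Newton iteration of Theorem 6.28; it uses instead an elementary bit-by-bit lift: if x is odd and $x^2 \equiv n \bmod 2^k$ with $k \ge 3$, then one of $x$ and $x + 2^{k-1}$ has square congruent to n modulo $2^{k+1}$. The function name `sqrt_2adic` is ours.

```python
def sqrt_2adic(n, N):
    """Return an odd integer x with x*x congruent to n modulo 2**N.

    Assumptions: n is congruent to 1 modulo 8 and N >= 3.
    """
    if n % 8 != 1 or N < 3:
        raise ValueError("need n = 1 mod 8 and N >= 3")
    x, k = 1, 3                      # invariant: x*x == n (mod 2**k)
    while k < N:
        if (x * x - n) % (1 << (k + 1)) != 0:
            # (x + 2**(k-1))**2 == x*x + 2**k (mod 2**(k+1)) because x is odd
            x += 1 << (k - 1)
        k += 1
    return x % (1 << N)

r = sqrt_2adic(17, 20)
print(r, (r * r - 17) % 2**20)       # the second number is 0
```

The starting value x = 1 plays the role of $a_0 = 1$ in the example, and the hypothesis $n \equiv 1 \bmod 8$ is what makes the invariant hold at k = 3.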


¹² In fact, n could be a 2-adic integer in this argument.




(3) Suppose that p > 3. Every nonzero residue $\bar a$ in $\mathbb{Z}/p\mathbb{Z}$ has $\bar a^{\,p-1} \equiv
1 \bmod p$. Corollary 6.29 shows immediately that the polynomial $X^{p-1} - 1$ has
a root a whose image in $\mathbb{Z}_p/p\mathbb{Z}_p$ is $\bar a$. Since the elements a are distinct, we
conclude that $\mathbb{Z}_p$ contains all p − 1 of the (p − 1)st roots of unity.
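
These roots of unity can be approximated modulo $p^N$ without Newton's method. The sketch below uses a different standard device from the one in Corollary 6.29: for an integer a prime to p, the power $a^{p^{N-1}}$ is congruent to a modulo p by Fermat's Little Theorem, and its (p − 1)st power is 1 modulo $p^N$ by Euler's theorem. The function name is ours.

```python
def roots_of_unity_mod(p, N):
    """Map each residue a = 1, ..., p-1 to an integer w with
    w == a (mod p) and w**(p-1) == 1 (mod p**N)."""
    modulus = p ** N
    return {a: pow(a, p ** (N - 1), modulus) for a in range(1, p)}

p, N = 7, 5
roots = roots_of_unity_mod(p, N)
for a, w in roots.items():
    print(a, w, w % p, pow(w, p - 1, p ** N))   # last column is always 1
```

The p − 1 values obtained are distinct modulo p, hence distinct modulo $p^N$, in agreement with the count in the example.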


(4) As promised at the beginning of Section 4, we show that $\mathbb{Q}_p$ for p = 5
is obtained also by completing the field $\mathbb{Q}(i)$ with respect to a certain absolute
value and that in fact there are two distinct equivalence classes of absolute values
on $\mathbb{Q}(i)$ for which $\mathbb{Q}_5$ results. Thus let $F = \mathbb{Q}$, $K = \mathbb{Q}(i)$, and $\mathfrak{p} = (5)$. The
prime factorization of $(5)\mathbb{Z}[i]$ is $(2 + i)(2 - i)$. If we put $P_1 = (2 + i)$ and
$P_2 = (2 - i)$, then $K_{P_1}$ and $K_{P_2}$ are both equal to $\mathbb{Q}_5$ because Example 1 above
shows that the square roots of −1 already appear in $\mathbb{Q}_5$. If a is one of the square
roots, then $|2 + a|_5 |2 - a|_5 = |(2 + a)(2 - a)|_5 = |5|_5 = \frac15$. Thus one of $|2 + a|_5$
and $|2 - a|_5$ equals $\frac15$ and the other equals 1. What is happening is that there are
two field mappings $\mathbb{Q}(i) \to \mathbb{Q}_5$. For each of them, the effect on the base field $\mathbb{Q}$
is the same; however, one field mapping sends i in $\mathbb{Q}(i)$ to a in $\mathbb{Q}_5$, and the other
sends i to −a. For definiteness, let us say that $|2 + a|_5 = \frac15$. Then the valuation
of $\mathbb{Q}(i)$ with respect to $P_1 = (2 + i)$ is consistent with the 5-adic valuation of $\mathbb{Q}_5$,
but the valuation of $P_2 = (2 - i)$ is not. This example shows why the definition
of completion insists on a mapping of valued fields (respecting absolute values),
not merely a mapping of fields.
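
The two choices of a can be compared numerically. The sketch below approximates the two square roots of −1 in $\mathbb{Z}_5$ modulo $5^N$ by Newton's iteration, starting from the residues 2 and 3, and then computes the valuations of 2 + a and 2 − a. It is an illustration with a fixed finite precision N, and the helper names are ours.

```python
def lift_sqrt_minus_one(a0, p, N):
    """Newton iteration for X^2 + 1 modulo p**N, starting from a0
    with a0*a0 + 1 divisible by the odd prime p."""
    modulus = p ** N
    a = a0 % modulus
    for _ in range(N):               # precision at least doubles each step
        a = (a - (a * a + 1) * pow(2 * a, -1, modulus)) % modulus
    return a

def valuation_mod(x, p, N):
    """p-adic valuation of x, computed modulo p**N (returns N if x == 0 there)."""
    x %= p ** N
    if x == 0:
        return N
    v = 0
    while x % p == 0:
        x //= p
        v += 1
    return v

p, N = 5, 6
for a0 in (2, 3):
    a = lift_sqrt_minus_one(a0, p, N)
    print("a = %d mod 5:" % a0,
          "v(2 + a) =", valuation_mod(2 + a, p, N),
          "v(2 - a) =", valuation_mod(2 - a, p, N))
```

The root congruent to 3 modulo 5 is the one with $|2 + a|_5 = \frac15$, which is the choice made for definiteness in the example; the root congruent to 2 gives the other embedding.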


(5) Suppose that p = 2. The question is the prime factorization of $f(X) =
X^3 + X^2 - 2X + 8$ in $\mathbb{Z}_2$. This polynomial was studied at length toward the end of
Section V.4 in connection with common index divisors. It is irreducible over $\mathbb{Q}$,
but we are to factor it over $\mathbb{Q}_2$. We shall show that it splits into first-degree factors.
Considering the polynomial modulo 2, we find that $f(X) \equiv (X - 1)X^2 \bmod 2$.
Since 1 is a simple root modulo 2, Corollary 6.29 says that there exists an element
$\theta_1$ in $\mathbb{Z}_2$ such that $f(\theta_1) = 0$ and $\theta_1 \equiv 1 \bmod 2$. Dividing f(X) by $X - \theta_1$, we
obtain

$$f(X) = (X - \theta_1)\big(X^2 + (\theta_1 + 1)X + (\theta_1(\theta_1 + 1) - 2)\big).$$

To show that the quadratic factor splits over $\mathbb{Q}_2$, it is necessary and sufficient
to show that its discriminant is a square, since $\mathbb{Q}_2$ has characteristic 0. The
discriminant is

$$(\theta_1 + 1)^2 - 4(\theta_1(\theta_1 + 1) - 2) = 4\Big(\big(\tfrac12(\theta_1 + 1)\big)^2 - (\theta_1(\theta_1 + 1) - 2)\Big),$$

and we can ignore the square factor of 4. We know that $\theta_1 \equiv 1 \bmod 2$. Let us
compute $\theta_1$ modulo $8\mathbb{Z}_2$ by writing $\theta_1 = 8q + c$ with $q \in \mathbb{Z}_2$ and with $c = \pm 1$
or $\pm 3$. Substituting into f(X) and computing modulo $8\mathbb{Z}_2$, we have

$$0 = f(\theta_1) \equiv c^3 + c^2 - 2c \bmod 8\mathbb{Z}_2.$$




Since c is odd, $c^3 \equiv c$ and $c^2 \equiv 1 \bmod 8$. Thus $0 \equiv c + 1 - 2c \bmod 8$ and
$c \equiv 1 \bmod 8$. Consequently

$$\big(\tfrac12(\theta_1 + 1)\big)^2 - (\theta_1(\theta_1 + 1) - 2) \equiv 1 \bmod 8.$$

By Example 2 any 2-adic integer that is $\equiv 1 \bmod 8\mathbb{Z}_2$ is a square in $\mathbb{Z}_2$, and thus
f(X) indeed factors over $\mathbb{Z}_2$ as the product of three first-degree factors.
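
The congruences in Example 5 can be confirmed by machine. The sketch below approximates the root $\theta_1$ of $f(X) = X^3 + X^2 - 2X + 8$ modulo $2^N$ by Newton's iteration starting from 1, which is legitimate because $f'$ takes odd values at odd integers, and then evaluates the quantity whose square root is needed. The precision N = 12 and the variable names are our choices.

```python
def f(x):
    return x**3 + x**2 - 2 * x + 8

def df(x):
    return 3 * x**2 + 2 * x - 2

N = 12
modulus = 2 ** N
theta = 1                                  # the simple root of f modulo 2
for _ in range(N):                         # more steps than needed
    theta = (theta - f(theta) * pow(df(theta), -1, modulus)) % modulus

# One fourth of the discriminant of the quadratic factor, known modulo 2**(N-1).
half = (theta + 1) // 2                    # exact division because theta is odd
D = half * half - (theta * (theta + 1) - 2)

print("theta mod 8 =", theta % 8)
print("f(theta) mod 2^N =", f(theta) % modulus)
print("D mod 8 =", D % 8)
```

Both residues modulo 8 come out equal to 1, matching the computation in the text, so that Example 2 applies to D.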


We conclude this section with a version of Hensel’s Lemma that we state
without proof.¹³ This version deals with factorizations rather than roots. Briefly
it says that we can lift a relatively prime factorization modulo $\mathfrak{p}$ to a factorization
in R[X] if at least one of the two factors modulo $\mathfrak{p}$ has leading coefficient 1. This
theorem certainly implies Corollary 6.29.


Theorem 6.30 (Hensel’s Lemma). Let F be a field with a nontrivial discrete
absolute value, necessarily nonarchimedean, and assume that F is complete. Let
R be the valuation ring, let $\mathfrak{p}$ be the unique maximal ideal, let k be the residue
class field, and let f(X) be a polynomial in R[X]. Suppose that there exist
polynomials $g_0(X)$ and $h_0(X)$ in R[X] such that $g_0(X) \bmod \mathfrak{p}$ and $h_0(X) \bmod \mathfrak{p}$
are relatively prime in k[X], $g_0$ has leading coefficient 1, and f(X) factors modulo
$\mathfrak{p}$ as $f(X) \equiv g_0(X)h_0(X) \bmod \mathfrak{p}$. Then there exist polynomials g(X) and h(X)
in R[X] such that g(X) has leading coefficient 1, $g(X) \equiv g_0(X) \bmod \mathfrak{p}$, $h(X) \equiv
h_0(X) \bmod \mathfrak{p}$, and f(X) factors in R[X] as $f(X) = g(X)h(X)$.


6. Ramification Indices and Residue Class Degrees 


Sections 1-4 have presented the ingredients of a two-stage process for analyzing 
congruence information, and now it is time to use everything together. The goal 
is to have techniques for extracting information about a global number-theoretic 
problem by seeing what the problem says about ideals, for reducing the questions 
about ideals to questions about powers of prime ideals, and for then assembling 
the results. 

We give one illustration of the utility of our constructions: With the techniques 
we had in Chapter V, we gave only a partial proof of the Dedekind Discriminant 
Theorem (Theorem 5.5). By contrast, we shall see in Section 8 that the present 
techniques lead naturally to a complete proof. 

Although we might want to work just within one number field, it is helpful to 
change the context so that we are comparing a number field with a finite extension. 
There is no loss of generality in doing so; we can always take the base field to 


¹³ A proof may be found in Hasse’s Number Theory, pp. 169-172.




be the rationals Q, and the effect is that we consider only the finite set of prime 
ideals for the extension field that contain a given prime number p. 

As long as we are going to consider finite extensions of fields in addressing 
number theory, we might as well treat also the case of function fields in one 
variable, at least to the extent that the two theories are quite analogous. Thus we 
are led to the following set-up. 

Let R be a Dedekind domain considered as a subring of its field of fractions
F, let K be a finite separable¹⁴ extension of F with [K : F] = n, and let T be
the integral closure of R in K. We shall work with F and K as valued fields,
having some absolute value on them. The case of interest in this section will be
that the absolute value is nonarchimedean and arises from a discrete valuation
whose valuation ring contains R or T, respectively. Theorem 6.5 shows that the
valuation is defined by means of some prime ideal $\wp$ of R or T, and the associated
absolute value may thus be denoted by an expression¹⁵ like $|\cdot|_{\wp}$.

We start from a prime ideal $\mathfrak{p}$ in R and form the corresponding absolute value
on F as in Section 3, obtaining a valued field $(F, |\cdot|_{\mathfrak{p}})$. Then we complete as in
Section 4, writing the completion as

$$\psi_0 : (F, |\cdot|_{\mathfrak{p}}) \to (F_{\mathfrak{p}}, |\cdot|_{\mathfrak{p}}).$$


We know that the ideal $\mathfrak{p}T$ in T has a prime factorization of the form $\mathfrak{p}T =
P_1^{e_1} \cdots P_g^{e_g}$, where $P_1, \dots, P_g$ are distinct prime ideals in T. The integers $e_i$ are
called ramification indices and the dimensions $f_i = \dim_{R/\mathfrak{p}}(T/P_i)$ are called
residue class degrees. We are interested in saying everything we can about
$P_1, \dots, P_g$ and about the indices $e_i$ and $f_i$. The fundamental relationship is
given by Theorem 9.60 of Basic Algebra, namely

$$\sum_{i=1}^{g} e_i f_i = n.$$
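
For a concrete check of this identity, take $R = \mathbb{Z}$ and $T = \mathbb{Z}[i]$, so that n = 2. The sketch below reads off the pairs $(e_i, f_i)$ for a prime number p from the factorization of $X^2 + 1$ modulo p. That this factorization mirrors the factorization of $p\mathbb{Z}[i]$ is a standard fact (valid here because $\mathbb{Z}[i] = \mathbb{Z}[X]/(X^2 + 1)$), which we assume rather than prove; the function name is ours.

```python
def splitting_in_gaussian_integers(p):
    """Return the list of pairs (e_i, f_i) for the primes of Z[i] above the prime p,
    read off from the roots of X^2 + 1 modulo p."""
    roots = [a for a in range(p) if (a * a + 1) % p == 0]
    if len(roots) == 2:          # two distinct linear factors: p splits
        return [(1, 1), (1, 1)]
    if len(roots) == 1:          # a double root (only p = 2): p ramifies
        return [(2, 1)]
    return [(1, 2)]              # X^2 + 1 irreducible modulo p: p is inert

for p in (2, 3, 5, 7, 13):
    pairs = splitting_in_gaussian_integers(p)
    print(p, pairs, "sum of e*f =", sum(e * f for e, f in pairs))
```

In every case the sum of the products $e_i f_i$ equals 2 = [Q(i) : Q].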


We know that each $P_i$ gives us a nonarchimedean absolute value $|\cdot|_{P_i}$ on K,
unique up to equivalence, and then a completion

$$\psi_i : (K, |\cdot|_{P_i}) \to (K_{P_i}, |\cdot|_{P_i}).$$


¹⁴ The role of separability will become apparent before the statement of Theorem 6.31 below.

¹⁵ The number-theory case ultimately requires also a limited amount of analysis of archimedean
absolute values, and that will be carried out in Section 9. In the context of passing from a Diophantine
equation to congruence information, part of the role that archimedean absolute values play is in
analyzing signs. Thus for example the simple-minded equation $x^2 + y^2 = -1$ has no solutions in
integers; the reason for the absence of solutions is a constraint on signs, not some limitation from
congruences with respect to powers of primes. Archimedean absolute values control signs.




The first important step is to establish an isomorphism involving fields such
that the identity $\sum_{i=1}^{g} e_i f_i = n$ is a dimension formula that follows from the
isomorphism. The identity in question concerns the ring $K \otimes_F F_{\mathfrak{p}}$, which is
a commutative algebra over K or over $F_{\mathfrak{p}}$, whichever we like, and which is
semisimple by Corollary 2.30 under our assumption that K is a finite separable
extension of F. The Wedderburn theory (Theorems 2.2 and 2.4) shows that
$K \otimes_F F_{\mathfrak{p}}$ is isomorphic to a finite direct product of fields,¹⁶ each of which is a
finite extension of $F_{\mathfrak{p}}$. What we shall prove later in this section is the following
theorem.


Theorem 6.31. Let R be a Dedekind domain considered as a subring of its
field of fractions F, let K be a finite separable extension of F with [K : F] = n,
and let T be the integral closure of R in K. If $\mathfrak{p}$ is a nonzero prime ideal of R
and if the ideal $\mathfrak{p}T$ in T has a prime factorization of the form $\mathfrak{p}T = P_1^{e_1} \cdots P_g^{e_g}$,
where $P_1, \dots, P_g$ are distinct prime ideals in T and the $e_j$ are positive integers,
then

$$K \otimes_F F_{\mathfrak{p}} \cong \prod_{j=1}^{g} K_{P_j}.$$


When the formula $\sum_{i=1}^{g} e_i f_i = n$ is applied to the field extension $K_{P_j}/F_{\mathfrak{p}}$,
it becomes $e_j^* f_j^* = [K_{P_j} : F_{\mathfrak{p}}]$, where $e_j^*$ and $f_j^*$ are the ramification index and
residue class degree associated to $K_{P_j}/F_{\mathfrak{p}}$. If we accept for the moment the result
of Lemma 6.36 below that $e_j^*$ and $f_j^*$ coincide with the corresponding indices $e_j$
and $f_j$ for K/F, then $n = \sum_{j=1}^{g} e_j f_j = \sum_{j=1}^{g} e_j^* f_j^* = \sum_{j=1}^{g} [K_{P_j} : F_{\mathfrak{p}}]$ indeed
counts the $F_{\mathfrak{p}}$ dimensions of both sides of the formula $K \otimes_F F_{\mathfrak{p}} \cong \prod_{j=1}^{g} K_{P_j}$
in the theorem. The theorem says much more than this, and we shall mine its
consequences after giving the proof of the theorem.

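For orientation, let us recall Example 4 from Section 5. In that example, we had
$R = \mathbb{Z}$, $F = \mathbb{Q}$, $K = \mathbb{Q}(i)$, $T = \mathbb{Z}[i]$, $\mathfrak{p} = 5\mathbb{Z}$, and $F_{\mathfrak{p}} = \mathbb{Q}_5$. The factorization
$\mathfrak{p}T = \prod_j P_j^{e_j}$ is $5\mathbb{Z}[i] = (2 + i)(2 - i)$, and the two completed versions of K are
$K_{(2+i)} \cong \mathbb{Q}_5$ and $K_{(2-i)} \cong \mathbb{Q}_5$. Thus the identity in the theorem specializes to

$$\mathbb{Q}(i) \otimes_{\mathbb{Q}} \mathbb{Q}_5 \cong \mathbb{Q}_5 \times \mathbb{Q}_5.$$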


Proving the identity on this level would be more challenging than necessary 
because the isomorphism cannot be unique; it can always be composed with 
the interchange of the two factors on the right side. For this reason the proof 
makes use of valued fields, and then in effect the desired isomorphism becomes 
a constructive one that we can write down rather explicitly. 
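
One way to picture such an explicit isomorphism is to send i to the pair (a, −a), where a is a square root of −1 in $\mathbb{Q}_5$. The sketch below works only modulo $5^N$ and only with Gaussian integers u + vi, so it is an approximation to the map on $\mathbb{Z}[i]$ rather than the isomorphism of the theorem itself. It obtains a as $2^{5^{N-1}}$ modulo $5^N$, a shortcut that is ours and not the text's, and all names are hypothetical.

```python
P, N = 5, 8
MOD = P ** N
A = pow(2, P ** (N - 1), MOD)        # A*A == -1 (mod 5**N) and A == 2 (mod 5)

def embed(u, v):
    """Image of u + v*i (u, v integers) in (Z/5^N) x (Z/5^N) under i -> (A, -A)."""
    return ((u + v * A) % MOD, (u - v * A) % MOD)

def gauss_mul(x, y):
    """Product of the Gaussian integers x = (u1, v1) and y = (u2, v2)."""
    (u1, v1), (u2, v2) = x, y
    return (u1 * u2 - v1 * v2, u1 * v2 + v1 * u2)

x, y = (3, 4), (1, -2)
lhs = embed(*gauss_mul(x, y))
rhs = tuple((s * t) % MOD for s, t in zip(embed(*x), embed(*y)))
print(lhs == rhs)                          # multiplication is respected
print(embed(*gauss_mul((2, 1), (2, -1))))  # the element 5 goes to (5, 5)
```

Interchanging the two coordinates gives a second map with the same properties, which is the nonuniqueness mentioned above.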


¹⁶ The words “direct product” in connection with finitely many fields refer to the direct sum of
the additive structures, with multiplication given coordinate by coordinate. 




Let us now work toward proving Theorem 6.31. Above, we mentioned the
completion mapping $\psi_0$ for F relative to an absolute value in the equivalence
class determined by $\mathfrak{p}$, as well as $\psi_j$ for K relative to some absolute value in the
class determined by $P_j$. In addition, we have inclusion mappings corresponding
to the field extensions K/F and $K_{P_j}/F_{\mathfrak{p}}$. Figure 6.1 below is a square diagram
that assigns the names $\varphi_0$ and $\varphi_j$ to these as well.


FIGURE 6.1. Commutativity of completion and extension as field mappings. 


The diagram in Figure 6.1 commutes. In fact, $\psi_j\varphi_0$ and $\varphi_j\psi_0$ are both F
homomorphisms, being compositions of F homomorphisms, and hence $x \in F$
implies $\psi_j\varphi_0(x) = x(\psi_j\varphi_0(1)) = x(1) = x(\varphi_j\psi_0(1)) = \varphi_j\psi_0(x)$.

But more is true: we are going to impose absolute values on the four fields in the
diagram in such a way that the four field mappings are homomorphisms of valued
fields. We have already defined $|\cdot|_{\mathfrak{p}}$ on F as any absolute value corresponding
to $\mathfrak{p}$, and then $|\cdot|_{\mathfrak{p}}$ is defined on $F_{\mathfrak{p}}$ in such a way that the completion mapping
$\psi_0$ preserves absolute values. Theorem 6.33 below will enable us to define an
absolute value in a unique fashion on $K_{P_j}$ such that $\varphi_j$ preserves absolute values.
Proposition 6.34 will give us the definition of an absolute value on K, and we
shall check in Lemma 6.35 that Figure 6.1 with these absolute values in place is
a commutative diagram of valued fields. Finally we use this commutativity to
prove in Lemma 6.36 that the ramification index $e_j^*$ and residue class degree $f_j^*$
for $K_{P_j}/F_{\mathfrak{p}}$ match the corresponding parameters $e_j$ and $f_j$ for K/F, and then we
are ready for the main part of the proof of the theorem.

We begin our preliminary work by limiting the possibilities for a finite
extension of a complete valued field $(F, |\cdot|_F)$. If K is a finite extension of F, a norm
on the F vector space K relative to $|\cdot|_F$ is a function $\|\cdot\|$ from K to $\mathbb{R}$ having

(i) $\|x\| \ge 0$ on K with equality if and only if x = 0,
(ii) $\|cx\| = |c|_F \|x\|$ for $c \in F$ and $x \in K$,
(iii) $\|x + y\| \le \|x\| + \|y\|$ for all x and y in K.


Lemma 6.32. If $(F, |\cdot|_F)$ is a complete valued field, if K is a finite extension
of F, and if $\|\cdot\|_1$ and $\|\cdot\|_2$ are any two norms on K relative to $|\cdot|_F$, then there
exist real constants C and C' such that

$$\|x\|_1 \le C\|x\|_2 \quad\text{and}\quad \|x\|_2 \le C'\|x\|_1 \quad\text{for all } x \in K.$$

Consequently K is Cauchy complete in the metric induced by either norm.




REMARK. It is not important that K be a field in this lemma, only that it be a 
finite-dimensional vector space over F. 


PROOF. Let $n = \dim_F K$. Fixing an ordered basis $(x_1, \dots, x_n)$ of K over F,
we may express any member x of K in the form $x = \sum_{i=1}^{n} c_i x_i$ with all $c_i$ in F.
With the $c_i$’s defined this way, we define $\|x\|_{\sup} = \max_{1 \le i \le n} |c_i|_F$. To prove the
displayed inequalities, it is enough to prove them for $\|\cdot\|_{\sup}$ and any other norm
$\|\cdot\|$. For one direction of the inequality, we have

$$\|x\| = \Big\|\sum_i c_i x_i\Big\| \le \sum_i \|c_i x_i\| = \sum_i |c_i|_F \|x_i\| \le \Big(\sum_i \|x_i\|\Big) \|x\|_{\sup}.$$

This proves that $\|x\| \le C\|x\|_{\sup}$ with $C = \sum_i \|x_i\|$.
For the reverse inequality we shall prove by induction on k that an inequality
$\|x\|_{\sup} \le C'_k \|x\|$ holds for all x in the F linear span of at most k of the
vectors $x_1, \dots, x_n$. The base case for the induction is k = 1, and then $\|x\|_{\sup} =
\|x_j\|^{-1} \|x\|$ whenever x is a multiple of $x_j$. So $C'_1 = \max_{1 \le j \le n} (\|x_j\|^{-1})$.
Assume that $C_1', \dots, C_k'$ exist and that we are to produce $C_{k+1}'$. Arguing by
contradiction, we may assume that there is some sequence $\{x^{(m)}\}$ in $K$, each term
having at most $k + 1$ nonzero coefficients, such that $\|x^{(m)}\| = 1$ for all $m$ and
$\|x^{(m)}\|_{\sup}$ tends to infinity. Possibly by passing to a subsequence, we may assume
that the nonzero coefficients of $x^{(m)}$ all lie in a particular subset of $k + 1$ of the
coefficients, and there is no harm in assuming that this subset is $\{1, \dots, k + 1\}$.
Passing to a further subsequence, we may assume that there is some index $j$ such
that the largest coefficient of each $x^{(m)}$, when measured by $|\cdot|_F$, is the $j^{\text{th}}$, and
there is no harm in assuming that $j = k + 1$.

Let $c_1^{(m)}, \dots, c_{k+1}^{(m)}$ be the coefficients of $x^{(m)}$, so that $x^{(m)} = \sum_{j=1}^{k+1} c_j^{(m)} x_j$. Put
$y^{(m)} = (c_{k+1}^{(m)})^{-1} x^{(m)} = d_1^{(m)} x_1 + \cdots + d_k^{(m)} x_k + x_{k+1}$, where $d_j^{(m)} = (c_{k+1}^{(m)})^{-1} c_j^{(m)}$. Here
$|d_i^{(m)}|_F \le 1$ for $1 \le i \le k$ and for all $m$, and also $\|y^{(m)}\| = |c_{k+1}^{(m)}|_F^{-1}\,\|x^{(m)}\| = |c_{k+1}^{(m)}|_F^{-1}$
tends to 0.


For each vector $y^{(m)} - x_{k+1}$, only the first $k$ coefficients can be nonzero, and the
same thing is true of differences $y^{(m)} - y^{(m')}$ of two such vectors. The inductive
hypothesis tells us that $\|y^{(m)} - y^{(m')}\|_{\sup} \le C_k'\,\|y^{(m)} - y^{(m')}\|$, and the right
side tends to 0 as $m$ and $m'$ tend to infinity because $\|y^{(m)}\|$ and $\|y^{(m')}\|$ tend to 0.
Therefore the $i^{\text{th}}$ coordinate of $y^{(m)}$ forms a Cauchy sequence. Since $F$ is given as
complete, $\{y^{(m)}\}$ is convergent in the norm $\|\cdot\|_{\sup}$ to some $y = \sum_{j=1}^{k} d_j x_j + x_{k+1}$
in $K$.

By the easy direction of our inequality, $\|y^{(m)} - y\| \le C\,\|y^{(m)} - y\|_{\sup}$. The right
side tends to 0, and hence so does the left. We know that $\|y^{(m)}\|$ tends to 0, and
hence $y = 0$. But this conclusion contradicts the form of $y$ as $\sum_{j=1}^{k} d_j x_j + x_{k+1}$
with coefficient 1 for $x_{k+1}$. We conclude that $C_{k+1}'$ exists as asserted, and the
lemma follows.




Theorem 6.33. If $(F, |\cdot|_F)$ is a complete valued field relative to a nontrivial
nonarchimedean discrete absolute value and if $K$ is a finite separable extension
of $F$ with $[K : F] = n$, then $K$ has a unique absolute value $|\cdot|_K$ extending
$|\cdot|_F$, $K$ is complete and nonarchimedean, and the integral closure $T$ in $K$ of
the valuation ring $R$ of $F$ is the valuation ring of $K$. The extension is given by

$$|x|_K = |N_{K/F}(x)|_F^{1/n}.$$


REMARKS. Since $T$ is the valuation ring, Proposition 6.2 shows that $T$ has a
unique nonzero prime ideal. It follows that if $\mathfrak{p}$ is a nonzero prime ideal of $R$,
then $\mathfrak{p}T = P^e$ for a single prime ideal $P$ of $T$. We shall make frequent use of
this fact in applications without explicit mention.
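
To make the formula of Theorem 6.33 concrete, here is a minimal Python sketch for the quadratic case $K = \mathbb{Q}_p(\sqrt{d})$ over $F = \mathbb{Q}_p$, restricted to elements $a + b\sqrt{d}$ with rational $a$ and $b$. The function names `v_p`, `v_K`, and `abs_K` are our own illustrative choices, and the code assumes without checking that $d$ is not a square in $\mathbb{Q}_p$, so that $n = 2$ and $N_{K/F}(a + b\sqrt{d}) = a^2 - db^2$.

```python
from fractions import Fraction


def v_p(a, p):
    """p-adic valuation of a nonzero rational number a."""
    a = Fraction(a)
    if a == 0:
        raise ValueError("the valuation of 0 is +infinity")
    num, den, v = a.numerator, a.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def v_K(a, b, p, d):
    """Valuation of x = a + b*sqrt(d) in K = Q_p(sqrt(d)), scaled to extend v_p.

    Applies |x|_K = |N_{K/F}(x)|_F^(1/n) with n = 2 and N(x) = a^2 - d*b^2.
    Assumes (without checking) that d is not a square in Q_p.
    """
    a, b = Fraction(a), Fraction(b)
    norm = a * a - d * b * b
    return Fraction(v_p(norm, p), 2)


def abs_K(a, b, p, d):
    """The absolute value |x|_K = p**(-v_K(x)), normalized so that |p|_K = 1/p."""
    return float(p) ** (-float(v_K(a, b, p, d)))
```

With $p = 5$ and $d = 2$ the valuations come out as integers, while with $p = 5$ and $d = 5$ the element $\sqrt{5}$ receives valuation $1/2$, which reflects the exponent $e = 2$ in $\mathfrak{p}T = P^e$ for that extension.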


PROOF. For uniqueness, suppose that $|\cdot|_1$ and $|\cdot|_2$ are two absolute values on
$K$ that extend $|\cdot|_F$. Let us see that each of these is a norm on $K$ relative to $|\cdot|_F$.
In fact, what needs checking for $|\cdot|_1$ is that the function respects scalars from
$F$ appropriately. If $c$ is in $F$ and $x_0$ is in $K$, then $|cx_0|_1 = |c|_1|x_0|_1 = |c|_F|x_0|_1$,
the second equality following because $|\cdot|_1$ restricts to $|\cdot|_F$ on $F$. A similar
argument applies to $|\cdot|_2$, and thus we are dealing with two norms.

If the two given absolute values are inequivalent, then Proposition 6.12 shows
in the presence of the nontriviality of $|\cdot|_1$ that we can find an $x \in K$ with
$|x|_1 > 1$ and $|x|_2 \le 1$. Then $\lim_k |x^{-k}|_1 = 0$ while $|x^{-k}|_2 \ge 1$ for all $k$.
Consequently there cannot exist a constant $C$ such that $|y|_2 \le C|y|_1$ for all
$y \in K$, in contradiction to Lemma 6.32.

We conclude that $|\cdot|_1$ and $|\cdot|_2$ are equivalent, say that $|x|_1 = |x|_2^s$ for all
$x \in K$ and some $s > 0$. Since $|\cdot|_F$ is nontrivial, there exists some $x_0 \in F$
with $|x_0|_F > 1$. The equality $|x_0|_F = |x_0|_F^s$ then implies that $s = 1$. This proves
uniqueness.

We turn to existence. Proposition 6.2 shows that the valuation ring $R$ in $F$ for
the discrete valuation $v_F$ corresponding to $|\cdot|_F$ on $F$ is a local principal ideal
domain and that the valuation ideal $\mathfrak{p}$ is the unique maximal ideal of $R$. Theorem
6.5 shows that the valuation $v_{\mathfrak{p}}$ determined by $\mathfrak{p}$ is the same as the given valuation
$v_F$. Hence $|\cdot|_F$ is given for all $a \in F$ by $|a|_F = r^{-v_{\mathfrak{p}}(a)}$ for some $r > 1$. Let $\pi$
be a generator of the principal ideal $\mathfrak{p}$ of $R$.

Since $K/F$ is finite and separable, Theorem 8.54 of Basic Algebra shows that
the integral closure $T$ of $R$ in $K$ is a Dedekind domain. Let $\mathfrak{p}T = P_1^{e_1} \cdots P_g^{e_g}$
be the factorization of the ideal $\mathfrak{p}T$ of $T$ into the product of powers of distinct
prime ideals of $T$. Each $P_j$ defines a nonarchimedean valuation $v_{P_j}$ of $K$. If $a$
is any element of $F$, then we can write $a = \pi^k u$ for some $u \in R^\times$ and some
integer $k$. The computation

$$aT = aRT = \pi^k uRT = \pi^k RT = \pi^k T = \mathfrak{p}^k T = P_1^{ke_1} \cdots P_g^{ke_g}$$

shows that $v_{\mathfrak{p}}(a) = k$ and that $v_{P_j}(a) = ke_j$. Hence $v_{P_j} = e_j v_{\mathfrak{p}}$ on
$F$, and therefore the formula $|x|_{P_j} = (r^{1/e_j})^{-v_{P_j}(x)}$ for $x \in K$ defines an absolute
value on $K$ that has

$$|a|_F = r^{-v_{\mathfrak{p}}(a)} = r^{-e_j^{-1} v_{P_j}(a)} = (r^{1/e_j})^{-v_{P_j}(a)} = |a|_{P_j}$$

for all $a$ in $F$. This proves existence. The absolute value $|\cdot|_{P_j}$ on $K$ is complete by
Lemma 6.32 and is nonarchimedean because it is given by a discrete valuation.

Let us show that $g = 1$. Arguing by contradiction, suppose that there are at
least two distinct prime ideals $P_1$ and $P_2$ of $T$ that contain $\mathfrak{p}$. Since $P_1 + P_2 = T$,
we can choose $x_1 \in P_1$ and $x_2 \in P_2$ with $x_1 + x_2 = 1$. Then $v_{P_1}(x_1) > 0$ and
$v_{P_1}(1) = 0$, from which we see that $v_{P_1}(x_2) = 0$. Since $v_{P_2}(x_2) > 0$, we obtain
a contradiction to the uniqueness part of the theorem. Thus the prime ideal of $T$
is unique. Let us write $P$ for this ideal.

We know that $v_P(T) \ge 0$, i.e., that $T$ is contained in the valuation ring of $v_P$.
Proposition 6.4 shows that the valuation ring of $v_P$ equals $S^{-1}T$, where $S$ is the
complement of $P$ in $T$. The uniqueness of $P$ means that $T$ is local, and hence
every member of $S$ is a unit in $T$. Thus $S^{-1}T = T$, and $T$ is the valuation ring.

Write $|\cdot|_K$ in place of $|\cdot|_P$. To prove the explicit formula for $|\cdot|_K$ in
the statement of the theorem, choose a finite Galois extension $L$ of $F$ that
contains $K$; such a field $L$ exists because $K/F$ is separable.^17 By the existence
just proved, let $|\cdot|_L$ be an extension of $|\cdot|_F$ to $L$. If $\sigma$ is in $\mathrm{Gal}(L/F)$, then
$x \mapsto |\sigma(x)|_L$ and $x \mapsto |x|_L$ are both absolute values on $L$ that extend $|\cdot|_F$. By
the uniqueness just proved, $|\sigma(x)|_L = |x|_L$. Applying $|\cdot|_L$ to both sides of the
formula $N_{L/F}(x) = \prod_{\sigma \in \mathrm{Gal}(L/F)} \sigma(x)$ gives

$$|N_{L/F}(x)|_F = |N_{L/F}(x)|_L = \prod_{\sigma \in \mathrm{Gal}(L/F)} |\sigma(x)|_L = |x|_L^{[L:F]}. \qquad (*)$$

If $x$ is in $K$, then the left side equals $(|N_{K/F}(x)|_F)^{[L:K]}$, and the right side equals
$(|x|_L)^{[L:K][K:F]} = (|x|_K^{[K:F]})^{[L:K]}$. Thus the desired formula follows by extracting
the positive $[L : K]^{\text{th}}$ root of both sides of $(*)$.


Proposition 6.34. Under the hypotheses of Theorem 6.31, let $v_{\mathfrak{p}}$ be the
valuation of $F$ defined by $\mathfrak{p}$, and let $v_{P_j}$ be the valuation of $K$ defined by $P_j$,
$1 \le j \le g$. Then $e_j v_{\mathfrak{p}} = v_{P_j}|_F$. Consequently if $|\cdot|_{\mathfrak{p}}$ is an absolute value on
$F$ defined by $\mathfrak{p}$, then for each $j$ some member $|\cdot|_{P_j}$ of the equivalence class
of absolute values defined on $K$ by $P_j$ is an extension of $|\cdot|_{\mathfrak{p}}$. In this case the
inclusion of $(F, |\cdot|_{\mathfrak{p}})$ into $(K, |\cdot|_{P_j})$ is a homomorphism of valued fields.


PROOF. Let $S$ be the multiplicative system in $R$ given as the set-theoretic
complement of $\mathfrak{p}$ in $R$. For the first conclusion Proposition 6.4 and Theorem 6.5
together show that it is enough to prove that

$$e_j v_{S^{-1}\mathfrak{p}} = v_{S^{-1}P_j}\big|_F. \qquad (*)$$


^17 The field $L$ can be taken to be a splitting field of the minimal polynomial over $F$ of an element
$\xi$ such that $K = F(\xi)$. The extension $L/F$ is separable by Corollary 9.30 of Basic Algebra.




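From the identity

$$\mathfrak{p}T = P_1^{e_1} \cdots P_g^{e_g},$$

we have

$$S^{-1}\mathfrak{p}T = (S^{-1}P_1)^{e_1} \cdots (S^{-1}P_g)^{e_g}. \qquad (**)$$

Since $S$ is the complement of $\mathfrak{p}$ in $R$, $v_{\mathfrak{p}}$ is 0 on $S$. Hence $v_{S^{-1}\mathfrak{p}}$ is 0 on $S$. From
$R \cap P_j = \mathfrak{p}$, we have $S \cap P_j \subseteq S \cap \mathfrak{p} = \varnothing$. Thus the members of $S$ lie in $R \subseteq T$
but in no $P_j$, and $v_{P_j}$ is 0 on $S$. Hence $v_{S^{-1}P_j}$ is 0 on $S$.

Let $\pi$ be a generator of the principal ideal $S^{-1}\mathfrak{p}$ in $S^{-1}R$, so that $v_{S^{-1}\mathfrak{p}}(\pi) = 1$.
Since $\pi S^{-1}T = S^{-1}\mathfrak{p}T$, equation $(**)$ shows that $v_{S^{-1}P_j}(\pi) = e_j$. Each element
$y$ of $F$ is of the form $y = \pi^k u$ for some integer $k$ and some $u \in F$ with
$v_{S^{-1}\mathfrak{p}}(u) = 0$. The element $u$ must be in $S^{-1}R$ but not $S^{-1}\mathfrak{p}$ and hence is a unit
in $S^{-1}R$. Thus $v_{S^{-1}P_j}(u) = 0$. We have now seen that $v_{S^{-1}P_j}(x) = e_j v_{S^{-1}\mathfrak{p}}(x)$ for the
element $x = u$ above and also for $x = \pi$. Therefore $v_{S^{-1}P_j}(x) = e_j v_{S^{-1}\mathfrak{p}}(x)$ for
all $x \in F$, and $(*)$ is proved.

Now that $e_j v_{\mathfrak{p}} = v_{P_j}|_F$, choose $r > 1$ such that $|x|_{\mathfrak{p}} = r^{-v_{\mathfrak{p}}(x)}$ for $x \in F$.
If $r'$ is defined by $r = (r')^{e_j}$, then the definition $|x|_{P_j} = (r')^{-v_{P_j}(x)}$ for $x \in K$
restricts for $x \in F$ to $|x|_{P_j} = (r')^{-v_{P_j}(x)} = (r')^{-e_j v_{\mathfrak{p}}(x)} = r^{-v_{\mathfrak{p}}(x)} = |x|_{\mathfrak{p}}$, and the
inclusion is indeed a homomorphism of valued fields.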
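
The relation $e_j v_{\mathfrak{p}} = v_{P_j}|_F$ and the rescaling $r = (r')^{e_j}$ can be checked numerically in a small case. The following Python sketch takes $R = \mathbb{Z}$, $T = \mathbb{Z}[i]$, $\mathfrak{p} = (2)$, and $P = (1 + i)$, for which $e = 2$. It relies on a fact that is an assumption relative to this section, namely that $P$ is the only prime of $\mathbb{Z}[i]$ over 2 and has residue class degree 1, so that $v_P(a + bi) = v_2(a^2 + b^2)$. All function names are ours.

```python
import math


def v2(n):
    """2-adic valuation of a nonzero integer n."""
    if n == 0:
        raise ValueError("the valuation of 0 is +infinity")
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return v


def v_P(a, b):
    """Valuation at P = (1 + i) of the nonzero Gaussian integer a + b*i.

    Assumption: P is the unique prime over 2, with residue class degree 1,
    so v_P(x) equals the 2-adic valuation of the norm a^2 + b^2.
    """
    return v2(a * a + b * b)


E = 2                      # ramification index of P over (2)
R = 2.0                    # base for |.|_2, so |a|_2 = R ** (-v2(a))
R_PRIME = R ** (1.0 / E)   # base for |.|_P, chosen so that R = R_PRIME ** E


def abs_2(a):
    return R ** (-v2(a))


def abs_P(a, b):
    return R_PRIME ** (-v_P(a, b))
```

For instance, `v_P(1, 1)` is 1, so $1 + i$ is a uniformizer at $P$, while `v_P(2, 0)` is 2, matching $e\,v_2(2) = 2$.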


With these facts in place, let us make Figure 6.1 into a commutative diagram
of valued fields. From $\mathfrak{p}$, we use any corresponding choice of $|\cdot|_{\mathfrak{p}}$ on $F$, and
this uniquely determines an absolute value by the same name on $F_{\mathfrak{p}}$. Next we
apply Theorem 6.33 to the inclusion $\varphi_j : F_{\mathfrak{p}} \to K_{P_j}$ to obtain a unique extension
of $|\cdot|_{\mathfrak{p}}$ from $F_{\mathfrak{p}}$ to an absolute value $|\cdot|_{P_j}$ on $K_{P_j}$.

Meanwhile, with the index $j$ specified, Proposition 6.34 gives us a unique
absolute value $|\cdot|_{P_j}$ on $K$ such that the inclusion $\varphi_0 : F \to K$ is a homomorphism
of valued fields. The completion mapping $\psi_j : K \to K_{P_j}$ in turn gives us a
second determination of $|\cdot|_{P_j}$ on $K_{P_j}$, and Lemma 6.35 below says that these
two determinations match, i.e., that Figure 6.2 is a commutative diagram of
homomorphisms of valued fields.


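$$
\begin{array}{ccc}
(F, |\cdot|_{\mathfrak{p}}) & \xrightarrow{\ \psi_0\ } & (F_{\mathfrak{p}}, |\cdot|_{\mathfrak{p}}) \\
{\scriptstyle \varphi_0}\big\downarrow & & \big\downarrow{\scriptstyle \varphi_j} \\
(K, |\cdot|_{P_j}) & \xrightarrow{\ \psi_j\ } & (K_{P_j}, |\cdot|_{P_j})
\end{array}
$$

FIGURE 6.2. Commutativity of completion and extension
as homomorphisms of valued fields.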




Lemma 6.35. In the above notation the two determinations of $|\cdot|_{P_j}$ on $K_{P_j}$
coincide: one by using Theorem 6.33 to insist that $\varphi_j\psi_0$ in Figure 6.2 be the
composition of homomorphisms of valued fields, and the other by using
Proposition 6.34 to insist that $\psi_j\varphi_0$ in Figure 6.2 be the composition of homomorphisms
of valued fields.


REMARKS. The commutativity formula $\psi_j\varphi_0 = \varphi_j\psi_0$ for field mappings is
known from the discussion concerning Figure 6.1.


PROOF. Let us give two different names to the two possible absolute values on
$K_{P_j}$, writing $|\cdot|'$ for the one that makes $|\psi_j(k)|' = |k|_{P_j}$ for $k \in K$ and writing
$|\cdot|''$ for the other, which makes $|\varphi_j(x)|'' = |x|_{\mathfrak{p}}$ for $x \in F_{\mathfrak{p}}$. Let $y$ be in $F$. Then
the equality $\psi_j\varphi_0 = \varphi_j\psi_0$ implies that

$$|\varphi_j\psi_0(y)|' = |\psi_j\varphi_0(y)|' = |\varphi_0(y)|_{P_j} = |y|_{\mathfrak{p}} = |\psi_0(y)|_{\mathfrak{p}}. \qquad (*)$$

If $x_0$ is given in $F_{\mathfrak{p}}$, then we can choose a sequence $\{x_n\}$ in $F$ with $\{\psi_0(x_n)\}$
convergent to $x_0$ in $F_{\mathfrak{p}}$. Then $\{\psi_0(x_n)\}$ is Cauchy in the metric on $F_{\mathfrak{p}}$, and
it follows from $(*)$ applied with $y = x_m - x_n$ that $\{\varphi_j\psi_0(x_n)\}$ is Cauchy in
the metric from $|\cdot|'$ on $K_{P_j}$. If we have a second such sequence $\{x_n'\}$ in $F$
with $\psi_0(x_n')$ convergent to $x_0$ and if we alternate the terms of $\{x_n\}$ and $\{x_n'\}$ to
produce a sequence $\{z_n\}$, then $\{\varphi_j\psi_0(z_n)\}$ remains Cauchy in the metric from $|\cdot|'$.
Since $|\cdot|'$ is complete, it follows that $|\varphi_j(x_0)|'$ is given by a well-defined limit
independently of the sequence in $\psi_0(F)$ used to approximate $x_0$. The formula
$(*)$ shows that $|\varphi_j(x_0)|' = |x_0|_{\mathfrak{p}}$, and the definition of $|\cdot|''$ shows that this equals
$|\varphi_j(x_0)|''$. By the uniqueness in Theorem 6.33, $|\cdot|' = |\cdot|''$ on $K_{P_j}$.


Lemma 6.36. In the above notation and that of Theorem 6.31, the ramification
index $e_j^*$ corresponding to $K_{P_j}/F_{\mathfrak{p}}$ for the closure of the ideal $\psi_j(P_j)$ coincides
with the ramification index $e_j$ corresponding to $K/F$ for the ideal $P_j$.


REMARK. In addition, the residue class degree $f_j^*$ for $K_{P_j}/F_{\mathfrak{p}}$ coincides with
the residue class degree $f_j$ for $K/F$. In fact, the five paragraphs of review that
follow Theorem 6.26 mention that residue class fields change neither during the
localization step nor in the completion step of our two-step process. Thus $R/\mathfrak{p}$
remains the same during the two steps, and so does $T/P_j$. Hence the dimension
of $T/P_j$ as a vector space over $R/\mathfrak{p}$ remains the same.


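PROOF. Let $v_{\mathfrak{p},F}$, $v_{P_j,K}$, $v_{\mathfrak{p},F_{\mathfrak{p}}}$, and $v_{P_j,K_{P_j}}$ be the valuations corresponding to
the absolute values on $F$, $K$, $F_{\mathfrak{p}}$, and $K_{P_j}$, respectively. The last of these is well
defined by Lemma 6.35. Proposition 6.34 shows that

$$e_j v_{\mathfrak{p},F} = v_{P_j,K}\,\varphi_0 \quad\text{and}\quad e_j^* v_{\mathfrak{p},F_{\mathfrak{p}}} = v_{P_j,K_{P_j}}\,\varphi_j. \qquad (*)$$

Meanwhile, the completion mappings $\psi_0$ and $\psi_j$ satisfy

$$v_{\mathfrak{p},F_{\mathfrak{p}}}\,\psi_0 = v_{\mathfrak{p},F} \quad\text{and}\quad v_{P_j,K_{P_j}}\,\psi_j = v_{P_j,K}. \qquad (**)$$

Multiplying the second equation of $(*)$ on the right by $\psi_0$ and substituting from
the first equation of $(**)$, we obtain

$$e_j^* v_{\mathfrak{p},F} = e_j^* v_{\mathfrak{p},F_{\mathfrak{p}}}\,\psi_0 = v_{P_j,K_{P_j}}\,\varphi_j\psi_0.$$

We substitute from the commutativity formula $\varphi_j\psi_0 = \psi_j\varphi_0$ and unwind the right
side as

$$v_{P_j,K_{P_j}}\,\psi_j\varphi_0 = v_{P_j,K}\,\varphi_0 = e_j v_{\mathfrak{p},F}.$$

Thus $e_j^* v_{\mathfrak{p},F} = e_j v_{\mathfrak{p},F}$. Since $v_{\mathfrak{p},F}$ is not identically 0, we obtain $e_j^* = e_j$.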

PROOF OF THEOREM 6.31. As was mentioned before the statement of the
theorem, it follows from Proposition 2.29 and the Wedderburn theory that $K \otimes_F F_{\mathfrak{p}}$
is isomorphic to a product $\prod_{i=1}^{g'} L_i$ of fields, each of which is a finite extension
of $F_{\mathfrak{p}}$ and each of which has $K$ embedded in it. The subfields $L_i$ are uniquely
determined within $K \otimes_F F_{\mathfrak{p}}$, and we let $\eta_i$ be the projection of $K \otimes_F F_{\mathfrak{p}}$ onto
$L_i$. Each $\eta_i$ is a ring homomorphism and is given by multiplication by a specific
element of $K \otimes_F F_{\mathfrak{p}}$, namely the element that is 1 in the $i^{\text{th}}$ position and is 0 in
the other positions. When restricted to $K \otimes 1$, $\eta_i$ gives a field map $\alpha_i : K \to L_i$;
when restricted to $1 \otimes F_{\mathfrak{p}}$, it gives a field map $\beta_i : F_{\mathfrak{p}} \to L_i$.

We shall develop a small abstract theory about these field maps $\alpha_i$ and $\beta_i$.
Suppose that $M$ is a field containing $F$, that $\alpha : K \to M$ and $\beta : F_{\mathfrak{p}} \to M$ are
$F$ algebra homomorphisms, and that $M$ is a finite separable extension of $\beta(F_{\mathfrak{p}})$.
Theorem 6.33 says that $M$ has a unique absolute value $|\cdot|_{\mathfrak{p},\beta}$ extending $|\cdot|_{\mathfrak{p}}$
and that the valued field $(M, |\cdot|_{\mathfrak{p},\beta})$ is complete. The extension property means
that $\beta : (F_{\mathfrak{p}}, |\cdot|_{\mathfrak{p}}) \to (M, |\cdot|_{\mathfrak{p},\beta})$ is a homomorphism of valued fields. The
restriction $\alpha^*(|\cdot|_{\mathfrak{p},\beta})$ to $K$ makes $(K, \alpha^*(|\cdot|_{\mathfrak{p},\beta}))$ into a valued field in such a
way that

$$\alpha : (K, \alpha^*(|\cdot|_{\mathfrak{p},\beta})) \to (M, |\cdot|_{\mathfrak{p},\beta}) \qquad (*)$$

is a homomorphism of valued fields. Let us see that

$$\alpha^*(|\cdot|_{\mathfrak{p},\beta}) \text{ is one (and only one) of the absolute values } |\cdot|_{P_j} \text{ on } K \qquad (**)$$

and that $\alpha$ in $(*)$ factors as the composition of the completion mapping

$$\psi_j : (K, |\cdot|_{P_j}) \to (K_{P_j}, |\cdot|_{P_j})$$

followed by some other homomorphism of valued fields

$$\iota : (K_{P_j}, |\cdot|_{P_j}) \to (M, |\cdot|_{\mathfrak{p},\beta}).$$

To get at $(**)$ and the factorization of $\alpha$, let us show that the field mapping

$$\varphi_0 : (F, |\cdot|_{\mathfrak{p}}) \to (K, \alpha^*(|\cdot|_{\mathfrak{p},\beta})) \qquad (\dagger)$$


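is a homomorphism of valued fields, i.e., that $\varphi_0^*\alpha^*(|\cdot|_{\mathfrak{p},\beta}) = |\cdot|_{\mathfrak{p}}$. The field
mappings $\alpha\varphi_0$ and $\beta\psi_0$, which carry $F$ into $M$ via $K$ and $F_{\mathfrak{p}}$, respectively, are
compositions of $F$ homomorphisms and hence are $F$ homomorphisms. Therefore
$x \in F$ implies that $\alpha\varphi_0(x) = x(\alpha\varphi_0(1)) = x(1) = x(\beta\psi_0(1)) = \beta\psi_0(x)$, and
we see that $\alpha\varphi_0 = \beta\psi_0$ on $F$. For $x \in F$, this identity accounts for the third
equality in the following computation proving $(\dagger)$:

$$|x|_{\mathfrak{p}} = |\psi_0 x|_{\mathfrak{p}} = |\beta\psi_0 x|_{\mathfrak{p},\beta} = |\alpha\varphi_0 x|_{\mathfrak{p},\beta} = (\alpha^*|\cdot|_{\mathfrak{p},\beta})(\varphi_0 x) = (\varphi_0^*\alpha^*(|\cdot|_{\mathfrak{p},\beta}))(x).$$

Returning to $(**)$ and applying $(\dagger)$, we see that $\alpha^*(|\cdot|_{\mathfrak{p},\beta})$ is $\le 1$ on $R$. Since
$T$ is the integral closure of $R$, Proposition 6.20 shows that $\alpha^*(|\cdot|_{\mathfrak{p},\beta})$ is $\le 1$
on $T$ and that it arises from some nonzero prime ideal of $T$, necessarily one of
the ideals $P_1, \dots, P_g$. This proves $(**)$. Then the factorization of $\alpha$ in $(*)$ follows from
$(**)$ and the universal mapping property of completions as given in Theorem
6.25, since $(M, |\cdot|_{\mathfrak{p},\beta})$ is complete.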

Now let us specialize by taking $M = L_i$ with $i$ fixed. As in the first
paragraph of the proof, the projection $\eta_i : K \otimes_F F_{\mathfrak{p}} \to L_i$ gives us field mappings
$\alpha_i : K \to L_i$ and $\beta_i : F_{\mathfrak{p}} \to L_i$ by composing $\eta_i$ with $K \to K \otimes 1$ and
with $F_{\mathfrak{p}} \to 1 \otimes F_{\mathfrak{p}}$. If $u_1, \dots, u_n$ is a vector-space basis of $K$ over $F$, then
$u_1 \otimes 1, \dots, u_n \otimes 1$ is a vector-space basis of $K \otimes_F F_{\mathfrak{p}}$ over $F_{\mathfrak{p}}$, and it follows
that $L_i$ is finite-dimensional over $F_{\mathfrak{p}}$. Let us check that $L_i$ is separable over $F_{\mathfrak{p}}$.
We are given that $K$ is separable over $F$, hence that $K = F(\xi)$ for an element
$\xi$ whose minimal polynomial $g(X)$ over $F$ is separable. Then $\xi \otimes 1$ is a root of
$g(X)$ regarded as in $F_{\mathfrak{p}}[X]$, and so is $\eta_i(\xi \otimes 1)$. Therefore $L_i/F_{\mathfrak{p}}$ is separable,
and the above theory is applicable. In the theory, $L_i$ acquires an absolute value
$|\cdot|_{\mathfrak{p},\beta_i}$ such that $\beta_i : (F_{\mathfrak{p}}, |\cdot|_{\mathfrak{p}}) \to (L_i, |\cdot|_{\mathfrak{p},\beta_i})$ is a homomorphism of valued
fields, and then $(L_i, |\cdot|_{\mathfrak{p},\beta_i})$ is complete. The theory produces a unique index
$j = j(i)$ making $\alpha_i : (K, |\cdot|_{P_j}) \to (L_i, |\cdot|_{\mathfrak{p},\beta_i})$ into a homomorphism of
valued fields.


Let us see that $\alpha_i(K)$ is dense in $L_i$. Every member of $L_i$ is the image under $\eta_i$
of some member $\sum_{l=1}^n u_l \otimes c_l$ of $K \otimes_F F_{\mathfrak{p}}$ with each $c_l$ in $F_{\mathfrak{p}}$. The computation

$$\eta_i(u_l \otimes c_l) = \eta_i((u_l \otimes 1)(1 \otimes c_l)) = \alpha_i(u_l)\beta_i(c_l)$$

shows that every member of $L_i$ is of the form $\sum_{l=1}^n \alpha_i(u_l)\beta_i(c_l)$. Since $F$ is dense
in $F_{\mathfrak{p}}$, we can choose members $c_l'$ of $F$ as close as we please to $c_l$. Since $\beta_i$ is
isometric, $\sum_{l=1}^n \alpha_i(u_l)\beta_i(c_l)$ is then close to $\sum_{l=1}^n \alpha_i(u_l)\beta_i(c_l') = \sum_{l=1}^n \alpha_i(c_l' u_l)$.
Consequently $\alpha_i(K)$ is indeed dense in $L_i$.

Recall in connection with $(*)$ that $\alpha_i : K \to L_i$ factors as a composition of
homomorphisms of valued fields, namely as $\psi_j : (K, |\cdot|_{P_j}) \to (K_{P_j}, |\cdot|_{P_j})$
followed by $\iota : (K_{P_j}, |\cdot|_{P_j}) \to (L_i, |\cdot|_{\mathfrak{p},\beta_i})$. Since $K_{P_j}$ is complete, $\iota(K_{P_j})$
is closed in $L_i$. The dense image $\alpha_i(K) = \iota(\psi_j(K))$ in $L_i$ is contained in the
closed subset $\iota(K_{P_j})$, and it follows that $\iota$ is onto $L_i$. That is, the homomorphism
of valued fields

$$\iota : (K_{P_j}, |\cdot|_{P_j}) \to (L_i, |\cdot|_{\mathfrak{p},\beta_i})$$

is an isomorphism. This identifies the valued field $(L_i, |\cdot|_{\mathfrak{p},\beta_i})$ as isomorphic to
$(K_{P_j}, |\cdot|_{P_j})$.

As a consequence of the argument thus far, we have constructed a choice-free
function $i \mapsto j(i)$ carrying $\{1, \dots, g'\}$ into $\{1, \dots, g\}$. The function has the
property that $K_{P_{j(i)}}$ is isomorphic as a valued field to $L_i$ for each $i$. We are going
to show that $i \mapsto j(i)$ is onto $\{1, \dots, g\}$. Thus let the completion homomorphism
$\psi_j : (K, |\cdot|_{P_j}) \to (K_{P_j}, |\cdot|_{P_j})$ be given.

The $F$ bilinear mapping $(\psi_j, \varphi_j) : K \times F_{\mathfrak{p}} \to K_{P_j}$ given by multiplication
has a linear extension

$$\psi_j \otimes \varphi_j : K \otimes_F F_{\mathfrak{p}} \to K_{P_j}$$

that is a ring homomorphism. The range $K_{P_j}$ is a field that is finite-dimensional
over $\varphi_j(F_{\mathfrak{p}})$, and the image of $\psi_j \otimes \varphi_j$ is a $\varphi_j(F_{\mathfrak{p}})$ vector subspace of $K_{P_j}$ that is
closed under multiplication. Consequently the image of $\psi_j \otimes \varphi_j$ is closed under
inverses^18 and is a field. The kernel of $\psi_j \otimes \varphi_j$ is therefore a maximal ideal, and
it follows that there exists some $i$ such that $\psi_j \otimes \varphi_j$ factors as a composition of
$\eta_i : K \otimes_F F_{\mathfrak{p}} \to L_i$ followed by a field map $\gamma : L_i \to K_{P_j}$.

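Having constructed a particular $L_i$, let us form $\alpha_i$, $\beta_i$, and $P_{j(i)}$ as in the abstract
theory with $M$. The map $\beta_i : (F_{\mathfrak{p}}, |\cdot|_{\mathfrak{p}}) \to (L_i, |\cdot|_{\mathfrak{p},\beta_i})$ is a homomorphism of
valued fields such that $\gamma\beta_i = \varphi_j$, and the map $\alpha_i : (K, |\cdot|_{P_{j(i)}}) \to (L_i, |\cdot|_{\mathfrak{p},\beta_i})$
is a homomorphism of valued fields such that $\gamma\alpha_i = \psi_j$. The existence part of
Theorem 6.33 shows that there exists an absolute value $|\cdot|_\gamma$ on $K_{P_j}$ such that
$\gamma : (L_i, |\cdot|_{\mathfrak{p},\beta_i}) \to (K_{P_j}, |\cdot|_\gamma)$ is a homomorphism of valued fields. Since
$\varphi_j^*(|\cdot|_{P_j}) = |\cdot|_{\mathfrak{p}} = \beta_i^*(|\cdot|_{\mathfrak{p},\beta_i}) = \beta_i^*\gamma^*(|\cdot|_\gamma) = \varphi_j^*(|\cdot|_\gamma)$, the uniqueness
part of Theorem 6.33 shows that $|\cdot|_\gamma = |\cdot|_{P_j}$ on $K_{P_j}$. Meanwhile, the equality
$\psi_j = \gamma\alpha_i$ implies that $\psi_j^* = \alpha_i^*\gamma^*$. Then we have

$$
\begin{aligned}
(|\cdot|_{P_j} \text{ on } K) &= \psi_j^*(|\cdot|_{P_j} \text{ on } K_{P_j}) \\
&= \alpha_i^*\gamma^*(|\cdot|_{P_j} \text{ on } K_{P_j}) && \text{since } \psi_j^* = \alpha_i^*\gamma^* \\
&= \alpha_i^*\gamma^*(|\cdot|_\gamma \text{ on } K_{P_j}) && \text{since } |\cdot|_\gamma = |\cdot|_{P_j} \\
&= \alpha_i^*(|\cdot|_{\mathfrak{p},\beta_i} \text{ on } L_i) \\
&= (|\cdot|_{P_{j(i)}} \text{ on } K).
\end{aligned}
$$

Therefore $j = j(i)$, and the map $i \mapsto j(i)$ is onto.

^18 The same argument applies here with $F_{\mathfrak{p}}$ as was used in Section 4 with $\mathbb{R}$: within a field if a
nonzero element is algebraic over a base field, then the smallest ring containing the base field and
the element contains also the inverse of the element.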

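To complete the proof, let us compute dimensions relative to $F_{\mathfrak{p}}$, starting
from the decomposition into fields $L_i$. The ramification index $e_j^*$ and the residue
class degree $f_j^*$ for the valuation ring and ideal of $K_{P_j}$ equal the corresponding
parameters $e_j$ and $f_j$ for $T$ and $P_j$, by Lemma 6.36. Thus we have

$$
\begin{aligned}
n &= \sum_{i=1}^{g'} \dim_{F_{\mathfrak{p}}} L_i = \sum_{i=1}^{g'} \dim_{F_{\mathfrak{p}}} K_{P_{j(i)}} = \sum_{j=1}^{g} \sum_{j(i)=j} \dim_{F_{\mathfrak{p}}} K_{P_{j(i)}} \\
&= \sum_{j=1}^{g} \sum_{j(i)=j} e_{j(i)}^* f_{j(i)}^* = \sum_{j=1}^{g} \sum_{j(i)=j} e_{j(i)} f_{j(i)} = \sum_{j=1}^{g} \big|\{i \mid j(i) = j\}\big|\, e_j f_j.
\end{aligned}
$$

On the other hand, we know that $n = \sum_{j=1}^{g} e_j f_j$, and we have just proved that
$|\{i \mid j(i) = j\}| \ge 1$ for each $j$. It follows that $|\{i \mid j(i) = j\}| = 1$ for each
$j$, i.e., that the function $i \mapsto j(i)$ is one-one onto. In particular, $g' = g$. The
theorem follows.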


Notationally what is happening in the proof of the theorem is that a function
$i \mapsto j(i)$ is constructed such that $\alpha_i : K \to L_i$ factors as $\alpha_i = \iota\psi_{j(i)}$ for some
canonical isomorphism $\iota : K_{P_{j(i)}} \to L_i$ of complete valued fields. Renumbering
the factors and ignoring canonical isomorphisms, we find that $K \otimes_F F_{\mathfrak{p}}$ is the
direct product of the factors $K_{P_i}$ and that $\alpha_i = \psi_i$ carries $K$ to $K \otimes 1$ and then
to the $i^{\text{th}}$ factor $K_{P_i}$. Any linear mapping of the form $A \otimes 1$ in effect is therefore
block diagonal with each block corresponding to the effect on some $K_{P_i}$.

Let us apply these considerations to operations "left-multiplication-by," which
we write as $l(\cdot)$. If $\xi$ is a member of $K$, the characteristic polynomial of $l(\xi)$
over $F$ is $\det(X1 - l(\xi))$, and the characteristic polynomial of $l(\xi) \otimes 1$ over $F_{\mathfrak{p}}$
is still $\det(X1 - l(\xi))$, but now with its coefficients from $F$ regarded as members
of $F_{\mathfrak{p}}$ via the inclusion $\psi_0 : F \to F_{\mathfrak{p}}$.

The linear function $X(1 \otimes 1) - l(\xi) \otimes 1$ is block diagonal, equal to
$X1 - l(\psi_i(\xi))$ on the $i^{\text{th}}$ block for $1 \le i \le g$. The characteristic polynomial
$\det(X1 - l(\xi))$, regarded as having coefficients in $F_{\mathfrak{p}}$, is therefore the product
of the $g$ characteristic polynomials $\det(X1 - l(\psi_i(\xi)))$, each with coefficients in $F_{\mathfrak{p}}$.
In turn, this product formula yields a sum formula for the trace $\mathrm{Tr}_{K/F}(\xi)$ and
a product formula for the norm $N_{K/F}(\xi)$. If $\xi$ is a primitive element for the
extension $K/F$, then we can say even more. Let us write all these consequences
as a corollary.


Corollary 6.37. Let $R$ be a Dedekind domain regarded as a subring of its field
of fractions $F$, let $K$ be a finite separable extension of $F$ with $[K : F] = n$, and
let $T$ be the integral closure of $R$ in $K$. Let $\mathfrak{p}$ be a nonzero prime ideal of $R$, and
let the ideal $\mathfrak{p}T$ in $T$ have a prime factorization of the form $\mathfrak{p}T = P_1^{e_1} \cdots P_g^{e_g}$,
where $P_1, \dots, P_g$ are distinct prime ideals in $T$ and $e_1, \dots, e_g$ are positive. For
$1 \le i \le g$, let $f_i = [T/P_i : R/\mathfrak{p}]$. If $\xi$ is any element of $K$, then

(a) the $F$ linear map $l(\xi)$ on $K$ given by left multiplication by $\xi$ has the
property that its field polynomial $\det(X1 - l(\xi))$ over $F$, when reinterpreted
as having coefficients in $F_{\mathfrak{p}}$, factors over $F_{\mathfrak{p}}$ as the product

$$\det(X1 - l(\xi)) = \prod_{i=1}^{g} \det(X1 - l(\xi_i))$$

of the $g$ field polynomials of the images $\xi_i = \psi_i(\xi)$ under the completion
map $\psi_i : K \to K_{P_i}$,
(b) $N_{K/F}(\xi) = \prod_{i=1}^{g} N_{K_{P_i}/F_{\mathfrak{p}}}(\xi_i)$,
(c) $\mathrm{Tr}_{K/F}(\xi) = \sum_{i=1}^{g} \mathrm{Tr}_{K_{P_i}/F_{\mathfrak{p}}}(\xi_i)$.
Furthermore, if $\xi$ and $F$ together generate $K$, if $m(X)$ is the minimal polynomial
of $\xi$ over $F$, and if $m(X) = \prod_{i=1}^{g'} m_i(X)$ expresses $m(X)$ as the product of
distinct monic irreducible polynomials in $F_{\mathfrak{p}}[X]$, then
(d) $g' = g$,
(e) there is a one-one onto function $i \mapsto k(i)$ on the set $\{1, \dots, g\}$ such that
$K_{P_i}$ is isomorphic as a field to $F_{\mathfrak{p}}[X]/(m_{k(i)}(X))$,
(f) $\deg m_{k(i)}(X) = e_i f_i$.


PROOF. Conclusion (a) was proved in the paragraph before the statement of
the corollary, and (b) and (c) follow immediately from (a).

Under the assumption that $K = F(\xi)$, the minimal polynomial $m(X)$ of
$\xi$ and the characteristic polynomial $\det(X1 - l(\xi))$ are equal; thus $m(X) =
\det(X1 - l(\xi))$ is irreducible over $F$. Applying Proposition 2.29a, we see that
$K \otimes_F F_{\mathfrak{p}} \cong F_{\mathfrak{p}}[X]/(m(X))$ as an $F_{\mathfrak{p}}$ algebra. The assumed separability of $K/F$
means that $m(X)$ is a separable polynomial, and $m(X)$ therefore factors over the
extension field $F_{\mathfrak{p}}$ of $F$ as a product of distinct monic irreducible polynomials
in $F_{\mathfrak{p}}[X]$, say as $m(X) = m_1(X) \cdots m_{g'}(X)$. The Chinese Remainder Theorem
implies that

$$K \otimes_F F_{\mathfrak{p}} \cong \prod_{i=1}^{g'} F_{\mathfrak{p}}[X]/(m_i(X)),$$

and each $F_{\mathfrak{p}}[X]/(m_i(X))$ is a field. The factors on the right must coincide with
the factors in Theorem 6.31, and it follows that $g' = g$ and that each $K_{P_i}$ is of
the form $F_{\mathfrak{p}}[X]/(m_k(X))$ for some $k = k(i)$. This proves (d) and (e). For (f),
$\deg m_{k(i)}(X)$ is the product of the ramification index and the residue class degree
for $K_{P_i}/F_{\mathfrak{p}}$, and this product equals $e_i f_i$ as a consequence of Lemma 6.36 and
its remark.
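
Conclusions (b) and (c) can be checked by hand in the simplest split case. The Python sketch below takes $F = \mathbb{Q}$, $K = \mathbb{Q}(i)$, and $\mathfrak{p} = (5)$, where $X^2 + 1$ has two roots $\pm r$ in $\mathbb{Z}_5$, so that $g = 2$ and both completions $K_{P_1}$ and $K_{P_2}$ equal $\mathbb{Q}_5$. We work modulo $5^4$ as a stand-in for $\mathbb{Z}_5$. The names `MOD`, `r`, and `local_images` are illustrative, and the exhaustive search for $r$ is only a convenience for a small modulus.

```python
P = 5
PRECISION = 4
MOD = P ** PRECISION  # arithmetic in Z/5^4 approximates arithmetic in Z_5

# A square root of -1 modulo 5^4, found by exhaustive search.
r = next(t for t in range(MOD) if (t * t + 1) % MOD == 0)


def local_images(a, b):
    """Images of xi = a + b*i under the two completion maps, truncated mod 5^4.

    One embedding sends i to r and the other sends i to -r.
    """
    return ((a + b * r) % MOD, (a - b * r) % MOD)


def global_norm(a, b):
    """N_{K/F}(a + b*i) for K = Q(i) and F = Q."""
    return a * a + b * b


def global_trace(a, b):
    """Tr_{K/F}(a + b*i) for K = Q(i) and F = Q."""
    return 2 * a
```

Because each local field here is $\mathbb{Q}_5$ itself, the local norm and the local trace of $\xi_i$ are just $\xi_i$. Thus (b) reads $a^2 + b^2 \equiv \xi_1\xi_2$ and (c) reads $2a \equiv \xi_1 + \xi_2$, both modulo $5^4$.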


A by-product of (d) is that we obtain a way of computing $g$ for the extension:
it is the number of irreducible factors into which $m(X)$ splits when it is factored
over $F_{\mathfrak{p}}$ instead of $F$. Hensel's Lemma in the form of Theorem 6.30 can help
with carrying out this factorization in favorable cases if $\xi$ is chosen to be integral
over $R$, i.e., to be in $T$. Namely we reduce the coefficients of $m(X)$ modulo
$\mathfrak{p}$, obtaining a monic polynomial in $(R/\mathfrak{p})[X]$, and we factor this polynomial^19
as a product of powers of distinct primes in $(R/\mathfrak{p})[X]$. Since the powers of
distinct primes are relatively prime and since everything is monic, Theorem 6.30
is applicable and allows us to lift the factorization to $F_{\mathfrak{p}}[X]$. The resulting monic
factors in $F_{\mathfrak{p}}[X]$ may not be irreducible in unfavorable circumstances,^20 but we
have at least made progress.
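
The simplest instance of this lifting is a simple root modulo $p$, which corresponds to a linear factor that is relatively prime to its cofactor. The Python sketch below handles only that special case for $R = \mathbb{Z}$ and uses Newton iteration, a standard substitute for the general factor lifting of Theorem 6.30, not the procedure of the text. Polynomials are coefficient lists `[c_0, c_1, ...]`, and all function names are ours. It needs Python 3.8 or later for modular inverses via `pow`.

```python
def poly_eval(coeffs, x, mod):
    """Evaluate the polynomial with coefficients [c_0, c_1, ...] at x modulo mod."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % mod
    return result


def derivative(coeffs):
    """Coefficient list of the formal derivative."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def simple_roots_mod_p(coeffs, p):
    """Residues t mod p with f(t) = 0 and f'(t) != 0 modulo p."""
    d = derivative(coeffs)
    return [t for t in range(p)
            if poly_eval(coeffs, t, p) == 0 and poly_eval(d, t, p) != 0]


def hensel_lift(coeffs, p, root, k):
    """Lift a simple root modulo p to a root modulo p**k by Newton iteration."""
    d = derivative(coeffs)
    mod = p
    x = root % p
    for _ in range(k - 1):
        mod *= p
        # f'(x) is a unit modulo p, hence invertible modulo the larger power.
        inv = pow(poly_eval(d, x, mod), -1, mod)
        x = (x - poly_eval(coeffs, x, mod) * inv) % mod
    return x
```

For $m(X) = X^3 + X^2 - 2X + 8$ and $p = 2$, the only simple root modulo 2 is 1, and `hensel_lift([8, -2, 1, 1], 2, 1, 10)` produces a root modulo $2^{10}$, which is the linear factor over $\mathbb{Z}_2$. The double root 0 modulo 2 is outside the scope of this sketch.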

Theorem 6.31 has accomplished even more than is stated in Corollary 6.37.
For each $i$, it has identified a field extension, namely $K_{P_i}/F_{\mathfrak{p}}$, in which the indices
$e_i$ and $f_i$ are isolated from the other $e_j$'s and $f_j$'s. Under an additional hypothesis
on the residue class field (it is enough to assume that the residue class field is
finite), Proposition 6.38 below shows that it is possible to interpolate a unique
intermediate field $L$ with $F_{\mathfrak{p}} \subseteq L \subseteq K_{P_i}$ such that the residue class degree (the
parameter $f$) of $K_{P_i}/L$ is 1 and the ramification index (the parameter $e$) of $L/F_{\mathfrak{p}}$
is 1. Thus the proposition says that we can separate $e_i$ and $f_i$ from each other.
One says that $K_{P_i}/L$ is totally ramified and $L/F_{\mathfrak{p}}$ is unramified.


Proposition 6.38. Let $F$ be a complete valued field under a nonarchimedean
discrete valuation $v$, let $R$ and $\mathfrak{p}$ be the valuation ring and valuation ideal for $v$, let
$K$ be a finite separable extension of $F$ of degree $n$, let $T$ be the integral closure of
$R$ in $K$, and let $P$ be the unique maximal ideal in $T$ as in Theorem 6.33. Suppose
that $R/\mathfrak{p}$ is a finite field. Let $e$ be the integer such that $\mathfrak{p}T = P^e$, and let $f$ be
the dimension of $T/P$ over $R/\mathfrak{p}$. Then there exists a unique intermediate field $L$
for which the integral closure $U$ of $R$ in $L$ and the unique maximal ideal $\wp$ in $U$
have the following properties:

(a) $\mathfrak{p}U = \wp$ and $\wp T = P^e$,

(b) $[U/\wp : R/\mathfrak{p}] = f$ and $[T/P : U/\wp] = 1$.


^19 On a computer, for example, if $R/\mathfrak{p}$ is finite.

^20 In Example 5 in the previous section, the given polynomial in $\mathbb{Z}[X]$ is $m(X) = X^3 + X^2 - 2X + 8$,
and the reduced polynomial in $\mathbb{F}_2[X]$ is $X^2(X + 1)$. Theorem 6.30 exhibits a factorization of $m(X)$
over $\mathbb{Z}_2[X]$ as the product of a linear factor and a quadratic factor, and we saw in Example 5 of
Section 5 that the quadratic factor is reducible over $\mathbb{Z}_2[X]$.


The proof is carried out in Problems 15-16 at the end of the chapter. We shall
apply Proposition 6.38 in Section 8. The intermediate field $L$ in the proposition
is called the inertia subfield of $K/F$.

Once this separation of an extension of a complete valued field into a totally 
ramified extension and an unramified extension has been accomplished, one can 
go on to study each kind of extension separately, in order to find out what kind 
of ramification is possible. The results are stated as Lemmas 6.47 and 6.48, and 
proofs are carried out in Problems 17-19 at the end of the chapter. 


7. Special Features of Galois Extensions 


In this section we analyze what happens in the setting of Theorem 6.31 when
the extension of fields is a Galois extension. For simplicity for the moment, let
us work with the number-field setting, even though analogous results hold for
function fields in one variable as well. Thus let $K/F$ be a finite Galois extension
of number fields, let $T$ and $R$ be the rings of algebraic integers in $K$ and $F$
respectively, and let $\mathfrak{p}$ be a nonzero prime ideal in $R$. Since the extension $K/F$
is Galois, the Galois group $\mathrm{Gal}(K/F)$ permutes transitively the nonzero prime
ideals containing $\mathfrak{p}T$, and the factorization of $\mathfrak{p}T$ into powers of distinct prime
ideals of $T$ takes the special form $\mathfrak{p}T = P_1^e \cdots P_g^e$ with all the exponents the
same.^21 In addition, the dimension of each finite field $T/P_i$ over $R/\mathfrak{p}$ is an
integer $f$ independent of $i$, and we have $efg = [K : F]$.
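
As a quick check of the identity $efg = [K : F]$, the Python sketch below computes $(e, f, g)$ for a rational prime $p$ in $K = \mathbb{Q}(i)$ over $F = \mathbb{Q}$. It uses a fact that is an assumption relative to this section: because $\mathbb{Z}[i]$ is the full ring of algebraic integers of $\mathbb{Q}(i)$, the factorization of $p\mathbb{Z}[i]$ mirrors the factorization of $X^2 + 1$ modulo $p$. The function name is ours, and the input is assumed to be prime.

```python
def splitting_data_gaussian(p):
    """Return (e, f, g) for the prime number p in K = Q(i) over F = Q.

    Assumption: the factorization of p*Z[i] follows that of X^2 + 1 mod p,
    which is valid because Z[i] is the full ring of integers of Q(i).
    """
    roots = [t for t in range(p) if (t * t + 1) % p == 0]
    if len(roots) == 2:
        return (1, 1, 2)   # two distinct linear factors: p splits
    if len(roots) == 1:
        return (2, 1, 1)   # a repeated linear factor: p ramifies (only p = 2)
    return (1, 2, 1)       # X^2 + 1 is irreducible mod p: p is inert
```

In each case the product of the three entries is $2 = [K : F]$, and the decomposition group at a prime over $p$ has order $ef$, which is 1 when $p$ splits and 2 otherwise.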

Let us review Theorem 9.64 and its surrounding discussion in Basic Algebra.
If we write $P$ for one of the ideals $P_i$, then the subgroup $G_P$ of $G =
\mathrm{Gal}(K/F)$ consisting of the members $\sigma$ with $\sigma(P) = P$ is called the
decomposition group at $P$. Each $\sigma \in G_P$ descends
to an automorphism $\overline{\sigma}$ of $T/P$ that fixes $R/\mathfrak{p}$, thereby yielding a member of
$\overline{G} = \mathrm{Gal}((T/P)/(R/\mathfrak{p}))$. The map $G_P \to \overline{G}$ is certainly a homomorphism,
and Theorem 9.64 of Basic Algebra says that it is onto. It follows that this
homomorphism is $e$-to-1. In Basic Algebra this homomorphism was of interest
when $F = \mathbb{Q}$ and $e = 1$, since it ensures the presence of certain kinds of
permutations in $G$ and makes it possible to determine $G$ completely in certain
circumstances.


^21 Lemma 9.61 and Theorem 9.62 of Basic Algebra.




Theorem 6.31 allows us to isolate each prime ideal $P$ in such an analysis,
reinterpreting everything in the context of a particular $\mathfrak{p}$-adic field. Carrying
through this process gives insights into the decomposition group and the nature
of the homomorphism $G_P \to \overline{G}$. The point of this section is to explain some of
these insights.

We work within the setting of Theorem 6.31 except that we assume that the
residue class fields are finite fields, as they are in the number-theory context. Thus
let $R$ be a Dedekind domain regarded as a subring of its field of fractions $F$, let
$K$ be a finite Galois extension of $F$ with $[K : F] = n$, and let $T$ be the integral
closure of $R$ in $K$. We suppose that $\mathfrak{p}$ is a nonzero prime ideal of $R$ and that $R/\mathfrak{p}$
is a finite field. Let $\mathfrak{p}T = P_1^e \cdots P_g^e$ be the prime factorization of the ideal $\mathfrak{p}T$
in $T$; here $P_1, \dots, P_g$ are assumed to be distinct prime ideals in $T$. Let $f$ be the
common value of the dimension of $T/P_i$ over $R/\mathfrak{p}$.

In the decomposition $K \otimes_F F_{\mathfrak{p}} \cong \prod_{i=1}^{g} K_{P_i}$ of Theorem 6.31, the projection
$\eta_i$ to the $i^{\text{th}}$ factor on the right side is a member of $K \otimes_F F_{\mathfrak{p}}$; specifically it is the
member of the direct product whose $i^{\text{th}}$ coordinate is the multiplicative identity
of $K_{P_i}$ and whose other coordinates are 0. The element $\eta_i$ is an idempotent in
the sense that $\eta_i^2 = \eta_i$, and the $\eta_i$'s are orthogonal in the sense that $\eta_i\eta_j = 0$ for
$i \ne j$. The only idempotents of $K \otimes_F F_{\mathfrak{p}}$ are the sums of distinct elements $\eta_i$,
and the $\eta_i$'s are distinguished from the other idempotents in being primitive: $\eta_i$
is not the sum of two nonzero orthogonal idempotents.

Recall the relationship derived in the proof of Theorem 6.31 between $P_i$ and
the element $\eta_i$: the mapping $\beta_i : F_{\mathfrak{p}} \to K_{P_i}$ given by $\beta_i(x) = (1 \otimes x)\eta_i$ for
$x \in F_{\mathfrak{p}}$ is a homomorphism of valued fields, and so is the mapping $\alpha_i : K \to K_{P_i}$
given by $\alpha_i(k) = (k \otimes 1)\eta_i$ for $k \in K$. These facts uniquely determine $P_i$ from
among the ideals $P_1, \dots, P_g$.

We extend the action by each member $\sigma$ of $G = \mathrm{Gal}(K/F)$ to $K \otimes_F F_{\mathfrak{p}}$ as the
transformation $\sigma \otimes 1$. Then $G$ acts on $K \otimes_F F_{\mathfrak{p}}$, manifestly keeping each element
of $F_{\mathfrak{p}}$ fixed. Since the members of $G$ respect multiplication and addition, they
map idempotents to idempotents in $K \otimes_F F_{\mathfrak{p}}$, sending primitive idempotents to
primitive idempotents. Thus $G$ permutes the elements $\eta_i$. The elements $x$ with
$\eta_i x = x$ are exactly the members of $K_{P_i}$, and hence $G$ permutes the fields $K_{P_i}$.


Lemma 6.39. In the above setting with $K/F$ Galois, let $P_i$ be one of the ideals
$P_1, \dots, P_g$. Then a member $\sigma$ of the Galois group $G = \mathrm{Gal}(K/F)$ extends to a
field automorphism of $K_{P_i}$ fixing $F_{\mathfrak{p}}$ if and only if it is an isometry of $(K, |\cdot|_{P_i})$,
i.e., if and only if $\sigma$ satisfies $|\sigma x|_{P_i} = |x|_{P_i}$ for all $x \in K$.


PROOF. If $\sigma$ is an isometry from $K$ into itself in the metric determined by $|\cdot|_{P_i}$,
then $\sigma$ is uniformly continuous as a function from $K$ into the complete space $K_{P_i}$
and therefore extends to a continuous function from the completion $K_{P_i}$ into $K_{P_i}$.
It follows from the continuity of the extension and the fact that $\sigma$ respects the
operations on $K$ that $\sigma$ respects the operations on $K_{P_i}$. These remarks apply also
to the extension of $\sigma^{-1}$, and the extension of $\sigma^{-1}$ is a two-sided inverse to the
extension of $\sigma$. Since $\sigma$ is the identity on $F$, the continuity forces the extension
of $\sigma$ to be the identity on $F_{\mathfrak p}$.

Conversely suppose that $\sigma$ extends to an automorphism of $K_{P_i}$ fixing $F_{\mathfrak p}$. Let
us use the name $\sigma$ also for the extension. On $K_{P_i}$, the functions $x \mapsto |x|_{P_i}$ and
$x \mapsto |\sigma(x)|_{P_i}$ are absolute values that extend $|\cdot|_{\mathfrak p}$ on $F_{\mathfrak p}$. Theorem 6.33 shows
that they must be equal, and therefore $\sigma$ is an isometry.


Proposition 6.40. In the above setting with $K/F$ Galois, let $P$ be one of
the ideals $P_1, \dots, P_g$, let $G = \mathrm{Gal}(K/F)$ be the Galois group, and let $G_P$
be the decomposition group at $P$. Then $K_P$ is a Galois extension of $F_{\mathfrak p}$, the
members of $G_P$ extend to be isometries of $K_P$ that fix $F_{\mathfrak p}$, and the resulting map
$\varphi : G_P \to \mathrm{Gal}(K_P/F_{\mathfrak p})$ exhibits $G_P$ as isomorphic to $\mathrm{Gal}(K_P/F_{\mathfrak p})$.


PROOF. Since $K_P$ is generated by $F_{\mathfrak p}$ and $K$, it is obtained by adjoining to
$F_{\mathfrak p}$ the same roots of the same polynomials over $F$ that are used to generate $K$.
Therefore $K_P/F_{\mathfrak p}$ is a Galois extension.

Lemma 6.39 gives us the map $\varphi$ of $G_P$ into $\mathrm{Gal}(K_P/F_{\mathfrak p})$. The map $\varphi$ is a
homomorphism because the extension of each member of $G_P$ is unique. It is
one-one because the inclusion $K \subseteq K_P$ is one-one.

To see that it is onto, let $\sigma$ be in $\mathrm{Gal}(K_P/F_{\mathfrak p})$, and choose an element $\xi \in K$
such that $K = F(\xi)$. If $m(X)$ is the minimal polynomial of $\xi$ over $F$, then $\sigma(\xi)$
is an element of $K_P$ with $m(\sigma(\xi)) = 0$. Consequently $\sigma(\xi)$ is a root of $m(X)$.
Since $K/F$ is Galois and $m(X)$ has one root in $K$, all its roots are in $K$. Thus
$\sigma(\xi)$ is in $K$. The most general member of $K$ is of the form $q(\xi)$, where $q(X)$
is a polynomial of degree less than $\deg m(X)$, and $q(\sigma(\xi))$ has to be in $K$ also.
Thus $\sigma$ is an automorphism of $K$ fixing $F$. As such, $\sigma$ must send $T$ into itself
and must send $P$ into some ideal $P_j$ of $T$ containing $\mathfrak p T$. Meanwhile, Lemma
6.39 shows that $\sigma$ is an isometry of $K$ relative to $|\cdot|_P$. Thus $\sigma$ must send $P$ into
itself. In other words, the restriction of $\sigma$ to $K$ is in the decomposition group $G_P$.


We know from Theorem 9.64 of Basic Algebra that every member $\sigma$ of the
decomposition group $G_P$ yields a member $\bar\sigma$ of $\mathrm{Gal}((T/P)/(R/\mathfrak p))$ and that
the resulting map $\sigma \mapsto \bar\sigma$ is a homomorphism onto. Proposition 6.40 allows
us to reinterpret this homomorphism as carrying the Galois group of $K_P$ onto
the Galois group of $T/P$. The order of $\mathrm{Gal}(K_P/F_{\mathfrak p})$ is $ef$, and the order of
$\mathrm{Gal}((T/P)/(R/\mathfrak p))$ is $f$. Thus the kernel of this homomorphism, which is
called the inertia group of $K_P/F_{\mathfrak p}$, has order $e$. By Galois theory the fixed
field $L$ of the inertia group has $[K_P : L] = e$, $L/F_{\mathfrak p}$ is a Galois extension,
and $\mathrm{Gal}(L/F_{\mathfrak p})$ has order $f$. This construction has been arranged to make
$\mathrm{Gal}(L/F_{\mathfrak p}) \cong \mathrm{Gal}((T/P)/(R/\mathfrak p))$. As the Galois group of a finite extension
of finite fields, the Galois group on the right is cyclic of order $f$. Therefore
$\mathrm{Gal}(L/F_{\mathfrak p})$ is cyclic of order $f$.

Referring back to the statement of Proposition 6.38, we might guess that the
fixed field $L$ of the inertia group is the unique intermediate field such that $K_P/L$
is totally ramified and $L/F_{\mathfrak p}$ is unramified. This guess is completely correct, but
we omit the proof.


8. Different and Discriminant 


Theorem 6.31 is the key to a “local/global” approach to handling certain kinds 
of problems in algebraic number theory and in its analog in algebraic geometry. 
To illustrate the approach and its power, we shall give in this section and in the 
problems at the end of the chapter a full proof for the Dedekind Discriminant 
Theorem (Theorem 5.5), which was left only partially proved in Chapter V. 
That theorem as stated in Chapter V says that the prime numbers p for which 
ramification occurs in passing from Q to a number field K are exactly the primes 
dividing the field discriminant. The result we obtain now²² will in fact generalize
Theorem 5.5 significantly. In giving the details, we leave the proofs of Proposition 
6.38 and Lemmas 6.47 and 6.48 to Problems 15-19 at the end of the chapter. 

In the approach used in Chapter V, we were unable to handle primes that are 
“common index divisors” in the sense of Section V.2. Section V.4 exhibited 
an example of a common index divisor. The difficulty with the approach in 
Chapter V is that localization by itself does not ostensibly separate the primes 
from one another sufficiently for us fully to handle them one at a time. The 
completion step is a tool powerful enough to complete the separation. 

For part of this section, we shall work in the setting of Theorem 6.31, in 
which we compare two Dedekind domains whose fields of fractions are related 
by a separable field extension. The situation of eventual interest is that the two 
Dedekind domains are the rings of algebraic integers within two number fields, 
but we shall encounter also p-adic versions of this situation. Thus let R be a 
Dedekind domain regarded as a subring of its field of fractions F, let K be a finite
separable extension of F with [K : F] = n, and let T be the integral closure
of R in K. In this setting we shall introduce an ideal D(K/F) of T known as
the “relative different” of the two fields, and we shall establish conditions under 
which the relative different captures fairly precisely what ramification occurs in 
passing from R to T. This is the generalized version of the Dedekind Discriminant 
Theorem and appears as Theorem 6.45 below.


²²Dedekind’s Theorem on Differents, given as Theorem 6.45.


In the special case that $F = \mathbb Q$, we shall see that the field discriminant $D_K$
satisfies $|D_K| = N(D(K/\mathbb Q))$. In words, the field discriminant is the absolute
norm of the relative different $D(K/\mathbb Q)$ except possibly for a sign. Using the
properties of $N(\cdot)$ listed in Proposition 5.4, we can read off the version of
the Dedekind Discriminant Theorem stated in Theorem 5.5 from the results we
establish about the relative different.

We work with fractional ideals in F and in K. If M is any nonzero fractional 
ideal of K, we define its (relative) dual as 


$$\widehat{M} = \{x \in K \mid \operatorname{Tr}_{K/F}(xy) \text{ is in } R \text{ for all } y \in M\}.$$


Lemma 6.41. In the above setting, if $M$ is a nonzero fractional ideal of $K$,
then so is its dual $\widehat{M}$.


PROOF. Since $T$ has $K$ as its field of fractions, there exists an $F$ vector space
basis $\{t_1, \dots, t_n\}$ of $K$ consisting of members of $T$. If $m_0$ is a nonzero member
of $M$ and $m_i = t_i m_0$, then $\{m_1, \dots, m_n\}$ is an $F$ vector space basis of $K$ lying in
$M$. Form the $R$ submodule $M_1 = \sum_{i=1}^n R m_i$ of $M$, and let $\{x_1, \dots, x_n\}$ be the
$F$ vector space basis of $K$ such that $\operatorname{Tr}_{K/F}(x_i m_j) = \delta_{ij}$. Let

$$\widehat{M_1} = \{x \in K \mid \operatorname{Tr}_{K/F}(xm) \text{ is in } R \text{ for all } m \in M_1\}.$$

If we expand a general element $x$ of $K$ as $x = \sum_{j=1}^n c_j x_j$, then a necessary
condition for $x$ to be in $\widehat{M_1}$ is that $c_j = \operatorname{Tr}_{K/F}(x m_j)$ be in $R$ for all $j$. On the
other hand, this condition is also sufficient because an element $x$ with all $c_j \in R$
has $\operatorname{Tr}_{K/F}(xm) = \sum_{j=1}^n c_j r_j$ if $m = \sum_{j=1}^n r_j m_j$. Thus $\widehat{M_1}$ is a finitely generated
$R$ module with $x_1, \dots, x_n$ as generators. Let $S$ be the $T$ submodule of $K$ given by
$S = \sum_{i=1}^n T x_i$. This is a finitely generated $T$ submodule of $K$ that contains $\widehat{M_1}$.
The inclusion $M \supseteq M_1$ evidently implies that $\widehat{M} \subseteq \widehat{M_1}$, and hence $\widehat{M} \subseteq S$. In
this way, $\widehat{M}$ is exhibited as a $T$ submodule of the finitely generated $T$ submodule
$S$ of $K$, and $\widehat{M}$ must itself be finitely generated because $T$ is a Noetherian ring.
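
The dual-basis step in this proof can be tried numerically. The following sketch is an illustration only, under assumptions that are not part of the text: it takes $F = \mathbb Q$, $K = \mathbb Q(\sqrt d)$ with $d$ squarefree, and $M_1 = T$ with the standard integral basis $\{1, \omega\}$. The helper names `mul`, `trace`, `integral_basis`, and `dual_basis` are hypothetical. It builds the matrix of traces $\operatorname{Tr}_{K/\mathbb Q}(m_i m_j)$, inverts it, and reads off the basis $\{x_1, x_2\}$ with $\operatorname{Tr}_{K/\mathbb Q}(x_i m_j) = \delta_{ij}$.

```python
from fractions import Fraction

def mul(u, v, d):
    """Product of a + b*sqrt(d) and c + e*sqrt(d), stored as coefficient pairs."""
    a, b = u
    c, e = v
    return (a * c + d * b * e, a * e + b * c)

def trace(u):
    """Trace from Q(sqrt(d)) to Q of a + b*sqrt(d)."""
    return 2 * u[0]

def integral_basis(d):
    """Standard Z basis {1, omega} of the ring of integers (d squarefree, assumed)."""
    half = Fraction(1, 2)
    omega = (half, half) if d % 4 == 1 else (Fraction(0), Fraction(1))
    return [(Fraction(1), Fraction(0)), omega]

def dual_basis(d):
    """Return (m, x, det): basis m, trace-dual basis x, and det of the trace matrix."""
    m = integral_basis(d)
    g = [[trace(mul(m[i], m[j], d)) for j in range(2)] for i in range(2)]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    inv = [[g[1][1] / det, -g[0][1] / det],
           [-g[1][0] / det, g[0][0] / det]]
    # x_i = sum_j inv[i][j] * m_j, computed coefficientwise
    x = [tuple(inv[i][0] * m[0][k] + inv[i][1] * m[1][k] for k in range(2))
         for i in range(2)]
    return m, x, det

m, x, det = dual_basis(-1)   # Gaussian integers Z[i]
```

For $d = -1$ the dual basis comes out as $\{\tfrac12, -\tfrac12 i\}$, so the dual of $\mathbb Z[i]$ is $\tfrac12\mathbb Z[i]$ in this example, and the determinant of the trace matrix is $-4$.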


Proposition 6.42. In the above setting, the dual $\widehat{T}$ of $T$ is of the form $\widehat{T} =
D(K/F)^{-1}$ for an ideal $D(K/F)$ of $T$. This ideal $D(K/F)$ has the property that


$$\widehat{M} = M^{-1} D(K/F)^{-1}$$


for every nonzero fractional ideal $M$ of $K$.


REMARK. The ideal $D(K/F)$ in $T$ is called the relative different of $K$ with
respect to $F$.


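PROOF. From the definition, $\widehat{T}$ consists of all $x$ in $K$ for which $\operatorname{Tr}_{K/F}(xt)$ is
in $R$ for all $t \in T$; any member $x$ of $T$ has this property, and thus $T \subseteq \widehat{T}$. Lemma 6.41
shows that $\widehat{T}$ is a fractional ideal of $K$. Since $\widehat{T}$ contains $T$, it is the inverse of an
ideal of $T$. This ideal we define as $D(K/F)$.

Let $M$ be an arbitrary nonzero fractional ideal of $K$. Since $M^{-1}M = T$, we
have $\operatorname{Tr}_{K/F}(M^{-1}D(K/F)^{-1} \cdot M) = \operatorname{Tr}_{K/F}(D(K/F)^{-1}) = \operatorname{Tr}_{K/F}(\widehat{T}\,T) \subseteq R$,
and it follows that $M^{-1}D(K/F)^{-1} \subseteq \widehat{M}$. For the reverse inclusion, let $x$ be in
$\widehat{M}$. Then $\operatorname{Tr}_{K/F}(xM \cdot t) \subseteq \operatorname{Tr}_{K/F}(xM) \subseteq R$ for all $t \in T$, and hence $xM \subseteq
\widehat{T} = D(K/F)^{-1}$. This being true for all $x \in \widehat{M}$, we obtain $\widehat{M}M \subseteq D(K/F)^{-1}$.
Therefore $\widehat{M} \subseteq M^{-1}D(K/F)^{-1}$.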


Proposition 6.43. In the above setting, if $L$ is a field with $F \subseteq L \subseteq K$, then
$$D(K/F) = D(K/L)\,D(L/F)$$
as an equality of fractional ideals in $K$.


REMARKS. Let $U$ be the integral closure of $R$ in $L$. In the displayed line of the
proposition, $D(L/F)$ is an ideal in $U$, and the right side amounts to the product
in $T$ given by $D(K/L) \cdot D(L/F)T$.

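PROOF. We use the fact that traces can be computed in stages. An element
$x$ of $K$ is in $D(K/F)^{-1}$ if and only if $\operatorname{Tr}_{K/F}(xT) \subseteq R$, if and only if
$\operatorname{Tr}_{L/F}(\operatorname{Tr}_{K/L}(xT)) \subseteq R$, if and only if $\operatorname{Tr}_{K/L}(xT) \subseteq \widehat{U} = D(L/F)^{-1}$, if and
only if $\operatorname{Tr}_{K/L}(xT\,D(L/F)) \subseteq U$, if and only if $xT\,D(L/F) \subseteq D(K/L)^{-1}$. Thus
$D(K/F)^{-1}D(L/F) = D(K/L)^{-1}$, and the result follows.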


The main result of this section, from which the Dedekind Discriminant The- 
orem will be derived as Corollary 6.49, is Theorem 6.45 below, Dedekind’s 
Theorem on Differents. The proof requires some preparation. Two results will be 
used to reduce Theorem 6.45 to a statement about complete fields, for which only a 
single prime ideal is involved, both for R and for T. The first of these is Theorem 
6.31, or more particularly its consequence for traces given in Corollary 6.37c. 
The other is the following strengthening of the Weak Approximation Theorem in 
the presence of additional hypotheses. The reduction step to a statement about 
complete fields then appears as Corollary 6.46. 


Theorem 6.44 (Strong Approximation Theorem). Let $F$ be a number field,
let $R$ be its ring of algebraic integers, let $P_1, \dots, P_r$ be distinct nonzero prime
ideals in $R$, and let $v_{P_j}$ for each $j$ be the valuation of $F$ and of its completion that
corresponds to $P_j$. If $l_1, \dots, l_r$ are integers and if $x_j$ for $1 \le j \le r$ is a member
of the completed field $F_{P_j}$, then there exists $y$ in $F$ such that


$$v_{P_j}(y - x_j) \ge l_j \qquad \text{for } 1 \le j \le r$$


and such that $v_Q(y) \ge 0$ for all other nonzero prime ideals $Q$ of $R$.


REMARKS. 

(1) It will be helpful to have a name for the property in the conclusion of
Theorem 6.44. Thus let $T$ be a Dedekind domain regarded as a subring of its
field of fractions $K$. We say that $T$ has the strong approximation property if
whenever distinct nonzero prime ideals $P_1, \dots, P_r$ of $T$ are given, along with
integers $l_1, \dots, l_r$ and members $x_j$ of the completed field $K_{P_j}$ for $1 \le j \le r$,
then there exists $y$ in $K$ such that $v_{P_j}(y - x_j) \ge l_j$ for $1 \le j \le r$ and such that
$v_Q(y) \ge 0$ for all other nonzero prime ideals $Q$ of $T$. The content of Theorem
6.44 is that the ring of algebraic integers in any number field has the strong
approximation property.

(2) More generally any principal ideal domain has the strong approximation
property. In fact, if $R$ is a principal ideal domain with field of fractions $F$, if $K$
is a finite extension of $F$, and if $T$ is the integral closure of $R$ in $K$, then $T$ is
a Dedekind domain (according to the remarks with Proposition 6.7), and $T$ has
the strong approximation property. The proof is an easy adaptation of the proof
below, with the principal ideal domain substituting for the ring $\mathbb Z$ of integers. As
a consequence if $k$ is a field and if $T$ is the integral closure of $k[X]$ in a finite
extension of $k(X)$, then $T$ has the strong approximation property.

(3) Any Dedekind domain with only finitely many prime ideals has the strong 
approximation property as an immediate consequence of the Weak Approximation 
Theorem (Theorem 6.23). One does not need to make use of the fact that such a 
domain is always a principal ideal domain. 

(4) For a number field the conclusion of the theorem as stated imposes a 
limitation on all the nonarchimedean absolute values. The conclusion cannot be 
strengthened to impose a limitation on all equivalence classes of absolute values, 
since the Artin product formula (Theorem 6.51 below) imposes a constraint on 
the set of all of them. 


PROOF.²³ We may assume that each $l_j$ satisfies $l_j \ge 0$. Recall that for each
prime number $p$, there are only finitely many prime ideals $P$ in $R$ with $P \cap \mathbb Z =
p\mathbb Z$. Possibly by moving some of the conditions $v_Q(y) \ge 0$ into the displayed
hypothesis concerning the $P_j$'s, we may assume that there is some finite set
$\{p_1, \dots, p_q\}$ of primes such that $\{P_1, \dots, P_r\}$ consists exactly of all prime ideals
$P$ such that $P \cap \mathbb Z = p_i\mathbb Z$ for some $i$ with $1 \le i \le q$.

Application of the Weak Approximation Theorem (Theorem 6.23) to the
absolute values corresponding to $P_1, \dots, P_r$ produces an element $z \in F$ with

$$v_{P_j}(z - x_j) \ge l_j \qquad \text{for } 1 \le j \le r.$$


²³This proof is from Hasse’s Number Theory, pp. 379-380. The argument for $R = \mathbb Z$ and all
$l_j = 0$ is the key. After an application of the Weak Approximation Theorem, what has to be shown
is that if $P_j = p_j\mathbb Z$ for $1 \le j \le r$ and if a rational $ab^{-1}$ is given, then there exists a rational $mn^{-1}$
with $n$ prime to $p_1, \dots, p_r$ such that the denominator of $ab^{-1} - mn^{-1}$ is divisible only by the primes
$p_1, \dots, p_r$. Another proof of Theorem 6.44, which appears in other books, uses the theory of adeles
and ideles to be developed in the next two sections, and again the argument for $\mathbb Z$ is the key.


Form the fractional ideal $zR$ in $F$, and let its unique factorization be $zR =
P_1^{a_1} \cdots P_r^{a_r} Q_1 Q_2^{-1}$, where the $a_j$ are in $\mathbb Z$ and where $Q_1$ and $Q_2$ are ideals of $R$
whose prime factorizations involve no $P_j$. Let us see that $Q_2$ divides a nonzero
principal ideal $(N)$ of $R$ whose generator $N$ is in $\mathbb Z$ and that $N$ can be chosen to
be relatively prime to $p_1, \dots, p_q$. In fact, it is enough to treat each prime factor
of $Q_2$ separately and multiply the results. For a prime factor $P$, we know that
$P \cap \mathbb Z = p\mathbb Z$ for some prime $p$ in $\mathbb Z$, and we know that $pR$ is the product of $P$ and
another ideal of $R$. This prime $p$ is nonassociate to each of $p_1, \dots, p_q$ because
the only prime ideals whose intersection with $\mathbb Z$ is some $p_i\mathbb Z$ are $P_1, \dots, P_r$ and
because no such prime ideal divides $Q_2$. Therefore the prime factorization of
$(N)$ contains no factor $P_1, \dots, P_r$.

Let $b$ be a positive integer to be specified, and choose an integer $l$ such that
$lN \equiv 1 \bmod p_i^b$ for $1 \le i \le q$. If $p_iR$ factors as $\prod_k P_{i_k}^{e_{i_k}}$ with each $P_{i_k}$ in
$\{P_1, \dots, P_r\}$, then $l$ has the property that $lN - 1$ lies in $\big(\prod_k P_{i_k}^{e_{i_k}}\big)^b$, hence in
each $P_{i_k}^b$. Consequently $lN - 1$ lies in $P_j^b$ for $1 \le j \le r$.

We show that if $b$ is sufficiently large, then the element $y = lNz$ is the
element we seek. First consider nonzero prime ideals $Q$ not in $\{P_1, \dots, P_r\}$. Our
factorizations of $zR$ and $(N)$, with $(N) = Q_2Q_3$ for an ideal $Q_3$ of $R$, show that
$yR = lQ_3Q_1P_1^{a_1} \cdots P_r^{a_r}$. The power of $Q$ on the right side is $\ge 0$ because $Q_1$ and
$Q_3$ are ideals of $R$, and thus


$$v_Q(y) \ge 0. \qquad (*)$$


Now write $y - x_j = (lN - 1)z + (z - x_j)$, and apply the valuation $v_{P_j}$. Then
we have
$$v_{P_j}(y - x_j) \ge \min\big(v_{P_j}((lN - 1)z),\ v_{P_j}(z - x_j)\big),$$


and it follows from $v_{P_j}(z - x_j) \ge l_j$ that
$$v_{P_j}(y - x_j) \ge l_j \qquad (**)$$
if we can arrange that
$$v_{P_j}((lN - 1)z) \ge l_j. \qquad (\dagger)$$


Since $lN - 1$ lies in $P_j^b$ and since $v_{P_j}(z) = a_j$, a sufficient condition for (†) is that
$b + a_j \ge l_j$. As $j$ varies, we impose only finitely many conditions on $b$ to get (†)
to hold for all $j$, and then the result is that (**) holds for all $j$. In combination
with (*), this inequality shows that $y$ has the required properties.
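
The construction in this proof can be traced concretely in the key case $R = \mathbb Z$, $F = \mathbb Q$. The sketch below is a minimal illustration under assumptions of its own: the targets $x_j$ are taken to be rational numbers rather than general $p$-adic numbers, the weak approximation step is done with the Chinese Remainder Theorem rather than by quoting Theorem 6.23, and the names `vp`, `weak_approx`, and `strong_approx` are hypothetical. The second function follows the proof: it strips from the denominator of $z$ the part $N$ prime to the $p_j$, picks $l$ with $lN \equiv 1 \bmod (p_1 \cdots p_r)^b$, and returns $y = lNz$.

```python
from fractions import Fraction
from math import inf, prod

def vp(x, p):
    """p-adic valuation of a rational number x (inf for x == 0)."""
    x = Fraction(x)
    if x == 0:
        return inf
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v

def weak_approx(primes, xs, ls):
    """Rational z with vp(z - x_j, p_j) >= l_j for all j (CRT-based sketch)."""
    worst = min(vp(x, p) for x in xs for p in primes)
    if worst == inf:                      # all targets are 0
        return Fraction(0)
    m = max(1, max(ls) - worst)           # precision needed at every prime
    M = prod(p ** m for p in primes)
    z = Fraction(0)
    for p, x in zip(primes, xs):
        pm = p ** m
        rest = M // pm
        u = rest * pow(rest, -1, pm)      # u = 1 mod p^m, u = 0 mod the other p_i^m
        z += x * u
    return z

def strong_approx(primes, xs, ls):
    """Rational y with vp(y - x_j, p_j) >= l_j and vq(y) >= 0 for all other primes q."""
    ls = [max(l, 0) for l in ls]          # as in the proof, assume each l_j >= 0
    z = weak_approx(primes, xs, ls)
    if z == 0:
        return z
    N = z.denominator                     # strip the p_j's to get the factor N
    for p in primes:
        while N % p == 0:
            N //= p
    b = max(1, max(l - vp(z, p) for p, l in zip(primes, ls)))   # b + a_j >= l_j
    l = pow(N, -1, prod(primes) ** b)     # l*N = 1 mod (p_1...p_r)^b
    return l * N * z

y = strong_approx([2, 3], [Fraction(1, 5), Fraction(7, 2)], [3, 2])
```

With the sample input, $y$ is within $2$-adic valuation $3$ of $1/5$ and within $3$-adic valuation $2$ of $7/2$, and its denominator involves only the primes 2 and 3. The modular inverse `pow(N, -1, modulus)` requires Python 3.8 or later.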


The preparation is all in place to prove Dedekind’s Theorem on Differents, 
from which we shall easily derive the Dedekind Discriminant Theorem. The 
statement is as follows. 




Theorem 6.45 (Dedekind’s Theorem on Differents). Let $R$ be a Dedekind
domain regarded as a subring of its field of fractions $F$, let $K$ be a finite separable
extension of $F$ with $[K : F] = n$, and let $T$ be the integral closure of $R$ in
$K$. Suppose that $T$ has the strong approximation property. Let $\mathfrak p$ be a nonzero
prime ideal in $R$, let $p > 0$ be the characteristic of the residue class field $R/\mathfrak p$,
let $\mathfrak p T = P_1^{e_1} \cdots P_g^{e_g}$ be the factorization of $\mathfrak p T$ as the product of positive
powers of distinct prime ideals in $T$, and let the relative different of $K/F$ split as
$D(K/F) = P_1^{e_1'} \cdots P_g^{e_g'} Q$ for an ideal $Q$ relatively prime to all $P_j$. Then for each
$j$ with $1 \le j \le g$, $e_j'$ is given by


$$e_j' = \begin{cases} e_j - 1 & \text{if } p \text{ does not divide } e_j, \\ e_j' \text{ with } e_j' \ge e_j & \text{if } p \text{ divides } e_j. \end{cases}$$


Consequently $D(K/F)$ has all $e_j' = 0$ if and only if $e_j = 1$ for all $j$.


The idea is to reduce Theorem 6.45 to the case of complete fields. In the
notation in the statement of the theorem, the prime ideals $P_1, \dots, P_g$ are exactly
the prime ideals of $T$ that divide $\mathfrak p T$, and it is customary to write $P_j \mid \mathfrak p$ for these
prime ideals of $T$ and only these. If $M$ is a nonzero fractional ideal of $K$ and if
$M = P_1^{k_1} \cdots P_g^{k_g} Q$ with $Q$ a fractional ideal whose factorization involves no $P_j$,
we define the $\mathfrak p$ component of $M$ to be


$$M_{\mathfrak p} = P_1^{k_1} \cdots P_g^{k_g}.$$

The understanding in the special case that all $k_j$ are 0 is that $M_{\mathfrak p}$ is taken to be $T$. In
all cases, $M$ is then the product over all $\mathfrak p$ of its $\mathfrak p^{\text{th}}$ component, since the complete
factorization of $M$ has nonzero exponents for only finitely many nonzero prime
ideals of $T$. For the two examples that appear in the statement of Theorem 6.45,


$$D(K/F)_{\mathfrak p} = \prod_{P_j \mid \mathfrak p} P_j^{e_j'} \qquad \text{and} \qquad (\mathfrak p T)_{\mathfrak p} = \prod_{P_j \mid \mathfrak p} P_j^{e_j}.$$

The reduction of Theorem 6.45 to the case of complete fields results from the
following proposition, which combines Theorem 6.31 and the strong approximation
property (Theorem 6.44 in the case of number fields).


Proposition 6.46. Let $R$ be a Dedekind domain regarded as a subring of its
field of fractions $F$, let $K$ be a finite separable extension of $F$ with $[K : F] = n$,
and let $T$ be the integral closure of $R$ in $K$. Suppose that $T$ has the strong
approximation property. If $\mathfrak p$ is any nonzero prime ideal in $R$, then the different
$D(K/F)$ has the property that


$$D(K/F) = \prod_{\mathfrak p} \prod_{P \mid \mathfrak p} D(K_P/F_{\mathfrak p}),$$


the outer product being taken over all nonzero prime ideals $\mathfrak p$ of $R$ and the inner
product being taken over all prime ideals $P$ of $T$ containing $\mathfrak p T$. Here the fields
$K_P$ and $F_{\mathfrak p}$ are the completions of $K$ and $F$ corresponding to $P$ and $\mathfrak p$, respectively.


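PROOF. We actually will show equality of the inverses of the two sides of the
displayed formula. By the first conclusion of Proposition 6.42, we are to show
that a member $x$ of $K$ has


$$\operatorname{Tr}_{K/F}(xT) \subseteq R \quad \text{if and only if} \quad \operatorname{Tr}_{K_{P_i}/F_{\mathfrak p}}((xT)_i) \subseteq R_{\mathfrak p} \qquad (*)$$


for all $\mathfrak p$ and all $P_i$ with $P_i \mid \mathfrak p$. Here $(\cdot)_i$ refers to the embedding $K \to K_{P_i}$ in
Theorem 6.31 given by $\xi \mapsto \xi_i = \eta_i(\xi \otimes 1)$, where $\eta_i$ is the $i^{\text{th}}$ projection. To
prove (*), we use the formula of Corollary 6.37c, namely


$$\operatorname{Tr}_{K/F}(\xi) = \sum_{i=1}^{g} \operatorname{Tr}_{K_{P_i}/F_{\mathfrak p}}(\xi_i) \qquad \text{for all } \xi \in K. \qquad (**)$$


This formula is valid for every $\mathfrak p$.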

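First suppose that $\operatorname{Tr}_{K_P/F_{\mathfrak p}}((xT)_i) \subseteq R_{\mathfrak p}$ for all $\mathfrak p$ and all $P$ with $P \mid \mathfrak p$. Fix $\mathfrak p$,
and put $\xi = xt$ with $t \in T$. Summing the traces over $P$ with $P \mid \mathfrak p$ and applying
(**), we see that the valuation with respect to $\mathfrak p$ of the member $\operatorname{Tr}_{K/F}(\xi)$ of $F$
is $\ge 0$. That is, the factor $\mathfrak p^k$ that appears in the factorization of the principal
fractional ideal $\operatorname{Tr}_{K/F}(\xi)R$ of $F$ has $k \ge 0$. This being true for all $\mathfrak p$ means that
$\operatorname{Tr}_{K/F}(\xi)R$ is an ordinary ideal. Hence $\operatorname{Tr}_{K/F}(\xi)$ is in $R$.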

In the reverse direction, suppose that $\operatorname{Tr}_{K/F}(xT) \subseteq R$. For each nonzero prime
ideal $P$ in $T$, let $v_P$ be the corresponding valuation. Fix $\mathfrak p$. Let $\{P_1, \dots, P_g\}$ be the
set of $P$'s with $P \mid \mathfrak p$. Now fix $i$. By the assumed strong approximation property
of $T$, there exists an element $y$ in $K$ with


$$v_{P_i}(y - x) \ge \max(v_{P_i}(x), 0),$$
$$v_{P_j}(y) \ge \max(v_{P_j}(x), 0) \qquad \text{for } j \ne i,$$
$$v_Q(y) \ge 0 \qquad \text{for all prime ideals } Q \notin \{P_1, \dots, P_g\}.$$


Let us see that $v_{P_j}(yx^{-1}) \ge 0$ for all $j$. For $j \ne i$, this is immediate because
$v_{P_j}(y) \ge v_{P_j}(x)$. For $j = i$, we compute that


$$v_{P_i}(yx^{-1} - 1) = v_{P_i}(y - x) - v_{P_i}(x) \ge \max(v_{P_i}(x), 0) - v_{P_i}(x) = \max(0, -v_{P_i}(x)) \ge 0,$$


and then we see that $v_{P_i}(yx^{-1}) \ge \min\big(v_{P_i}(yx^{-1} - 1),\ v_{P_i}(1)\big) = 0$.




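With $y$ now fixed, we make use of the strong approximation property of $T$ a
second time, obtaining an element $z$ in $K$ with


$$v_{P_j}(z - yx^{-1}) \ge \max(v_{P_j}(x^{-1}), 0) \qquad \text{for } 1 \le j \le g,$$
$$v_Q(z) \ge 0 \qquad \text{for all prime ideals } Q \notin \{P_1, \dots, P_g\}.$$


Since $v_{P_j}(yx^{-1}) \ge 0$ and $v_{P_j}(z - yx^{-1}) \ge 0$ for all $j$, we find that $v_{P_j}(z) \ge 0$
for all $j$. From $v_Q(z) \ge 0$ for all other $Q$, we conclude that $z$ is in $T$. Since
$\operatorname{Tr}_{K/F}(xT) \subseteq R$, $\operatorname{Tr}_{K/F}(xz)$ lies in $R$. The trace formula (**) therefore shows
that


$$\sum_{j=1}^{g} \operatorname{Tr}_{K_{P_j}/F_{\mathfrak p}}(x_j z_j) \ \text{lies in } R_{\mathfrak p}. \qquad (\dagger)$$

Meanwhile, we have

$$\operatorname{Tr}_{K_{P_j}/F_{\mathfrak p}}(x_j z_j) = \operatorname{Tr}_{K_{P_j}/F_{\mathfrak p}}\big(x_j(z_j - y_j x_j^{-1})\big) + \operatorname{Tr}_{K_{P_j}/F_{\mathfrak p}}(y_j) \qquad (\dagger\dagger)$$


for $1 \le j \le g$. For all $j$, the first term on the right side of (††) lies in $R_{\mathfrak p}$ because
the definition of $z$ makes $v_{P_j}(x(z - yx^{-1})) \ge 0$. For $j \ne i$, the second term
on the right side lies in $R_{\mathfrak p}$ because of the definition of $y$. Thus (††) shows that
$\operatorname{Tr}_{K_{P_j}/F_{\mathfrak p}}(x_j z_j)$ lies in $R_{\mathfrak p}$ for $j \ne i$. Comparing this conclusion with (†), we see
that $\operatorname{Tr}_{K_{P_i}/F_{\mathfrak p}}(x_i z_i)$ lies in $R_{\mathfrak p}$. Resubstituting into (††), we find that


$$\operatorname{Tr}_{K_{P_i}/F_{\mathfrak p}}(y_i) \ \text{lies in } R_{\mathfrak p}. \qquad (\ddagger)$$


Finally the definition of $y$ shows that $v_{P_i}(y - x) \ge 0$. Hence $\operatorname{Tr}_{K_{P_i}/F_{\mathfrak p}}(y_i - x_i)$
is in $R_{\mathfrak p}$. Combining this fact with (‡), we conclude that $\operatorname{Tr}_{K_{P_i}/F_{\mathfrak p}}(x_i)$ is in $R_{\mathfrak p}$.
Since $i$ is arbitrary, $\operatorname{Tr}_{K_{P_j}/F_{\mathfrak p}}(x_j)$ is in $R_{\mathfrak p}$ for $1 \le j \le g$.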


With the proof of Theorem 6.45 reduced to the case of complete valued fields 
by Proposition 6.46, we need to make use of Lemmas 6.47 and 6.48 below, whose 
proofs are carried out in Problems 17-19 at the end of the chapter. 


Lemma 6.47. Let $F$ be a complete valued field with respect to a discrete
nonarchimedean valuation, let $R$ be its valuation ring, let $\mathfrak p$ be its valuation ideal,
let $K$ be a finite separable extension of $F$ with $[K : F] = n$, let $T$ be the integral
closure of $R$ in $K$, and let $P$ be the unique nonzero prime ideal in $T$. Suppose
that $K/F$ is totally ramified with $\mathfrak p T = P^e$ for an integer $e \ge 1$, and suppose that
the isomorphic residue class fields $R/\mathfrak p$ and $T/P$ are finite fields of characteristic
$p$. Then the different $D(K/F)$ is given by $D(K/F) = P^{e'}$, where


$$e' = \begin{cases} e - 1 & \text{if } p \text{ does not divide } e, \\ e' \text{ with } e' \ge e & \text{if } p \text{ divides } e. \end{cases}$$


Lemma 6.48. Let $F$ be a complete valued field with respect to a discrete
nonarchimedean valuation, let $R$ be its valuation ring, let $\mathfrak p$ be its valuation ideal,
let $K$ be a finite separable extension of $F$ with $[K : F] = n$, let $T$ be the integral
closure of $R$ in $K$, and let $P$ be the unique nonzero prime ideal in $T$. Suppose that
$K/F$ is unramified, i.e., has $\mathfrak p T = P$, and suppose that the residue class fields
$R/\mathfrak p$ and $T/P$ are finite fields of characteristic $p$. Then the different $D(K/F)$
equals $T$.


PROOF OF THEOREM 6.45. Proposition 6.46 shows that


$$D(K/F)_{\mathfrak p} = \prod_{P \mid \mathfrak p} D(K_P/F_{\mathfrak p}). \qquad (*)$$


Thus consider an extension $K_P/F_{\mathfrak p}$ of complete valued fields. Let $L$ be the inertia
subfield of $K_P/F_{\mathfrak p}$, as given by Proposition 6.38. The intermediate field $L$ has the
properties that $K_P/L$ is totally ramified and that $L/F_{\mathfrak p}$ is unramified.

Let $U$ be the integral closure of $R$ in $L$, and let $\wp$ be the unique nonzero
prime ideal in $U$. The properties of $L$ make $\wp T = P^e$ for a suitable integer
$e = e(P \mid \wp)$, $T/P \cong U/\wp$, and $\mathfrak p U = \wp$. Lemmas 6.47 and 6.48 tell us that
$D(L/F_{\mathfrak p}) = U$ and that $D(K_P/L) = P^{e'}$, where


$$e' = \begin{cases} e - 1 & \text{if } p \text{ does not divide } e, \\ e' \text{ with } e' \ge e & \text{if } p \text{ divides } e. \end{cases} \qquad (**)$$


Problem 33 at the end of Chapter IX of Basic Algebra shows that ramification
indices multiply for successive extensions. Thus $e(P \mid \mathfrak p) = e(P \mid \wp)\,e(\wp \mid \mathfrak p) =
e \cdot 1 = e$. Proposition 6.43 shows that differents multiply in corresponding fashion.
Therefore $D(K_P/F_{\mathfrak p}) = D(K_P/L)\,D(L/F_{\mathfrak p}) = P^{e'}U = P^{e'}$. Substituting into
(*), we obtain


$$D(K/F)_{\mathfrak p} = \prod_{P \mid \mathfrak p} D(K_P/F_{\mathfrak p}) = \prod_{P \mid \mathfrak p} P^{e'(P \mid \mathfrak p)},$$


where $e'(P \mid \mathfrak p)$ is the integer $e'$ of (**) when $e = e(P \mid \mathfrak p)$. This proves Theorem
6.45 for the $\mathfrak p^{\text{th}}$ component of $D(K/F)$. Since $\mathfrak p$ is arbitrary and only finitely
many components can be unequal to $T$, the theorem follows.
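
The two cases of Theorem 6.45 can be checked by hand or by machine for quadratic fields. The sketch below is an illustration under assumptions not stated in this proof: it takes $K = \mathbb Q(\sqrt d)$ with $d$ squarefree, uses the standard formula for the field discriminant of a quadratic field, and uses the relation $|D_K| = N(D(K/\mathbb Q))$ mentioned earlier in this section. For a ramified prime $p$ one has $(p)T = P^2$ with $N(P) = p$, so the exponent $e'$ of $P$ in the different equals the exponent of $p$ in $D_K$. All function names are hypothetical.

```python
def is_squarefree(n):
    """True if no square k*k with k >= 2 divides n."""
    n, k = abs(n), 2
    while k * k <= n:
        if n % (k * k) == 0:
            return False
        k += 1
    return True

def field_discriminant(d):
    """Field discriminant of Q(sqrt(d)) for squarefree d other than 0 and 1."""
    return d if d % 4 == 1 else 4 * d

def valuation(n, p):
    """Exponent of the prime p in the nonzero integer n."""
    n, v = abs(n), 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def prime_divisors(n):
    """Sorted list of the primes dividing the nonzero integer n."""
    n, p, out = abs(n), 2, []
    while p * p <= n:
        if n % p == 0:
            out.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out

def different_exponents(d):
    """Map each ramified prime p to (e, e') with e = 2 and e' = v_p(D_K)."""
    D = field_discriminant(d)
    return {p: (2, valuation(D, p)) for p in prime_divisors(D)}

def satisfies_theorem(d):
    """Check e' = e - 1 when p does not divide e, and e' >= e when p divides e."""
    return all(
        (ep == e - 1) if e % p != 0 else (ep >= e)
        for p, (e, ep) in different_exponents(d).items()
    )
```

For instance, $d = -3$ gives $e' = 1 = e - 1$ at $p = 3$, while $d = -1$ and $d = 2$ give $e' = 2$ and $e' = 3$ at $p = 2$, the case in which $p$ divides $e$.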


Corollary 6.49 (=THEOREM 5.5, Dedekind Discriminant Theorem). Let $K$
be a number field, let $T$ be its ring of algebraic integers, let $p$ be a prime number,
and let $(p)T = P_1^{e_1} \cdots P_g^{e_g}$ be the factorization of $(p)T$ as the product of powers
of distinct prime ideals in $T$. Then $e_j$ is greater than 1 for some $j$ if and only if
$p$ divides the field discriminant $D_K$.


PROOF. Let us observe first that the discriminant $D_K$ is given up to sign by the
index $|\widehat{T}/T|$. In fact, $T$ is a torsion-free finitely generated abelian group and hence
is free abelian of rank $n = [K : \mathbb Q]$, say with an ordered $\mathbb Z$ basis $\Gamma = (x_1, \dots, x_n)$.
Since the $\mathbb Q$ bilinear form $(x, y) \mapsto \operatorname{Tr}_{K/\mathbb Q}(xy)$ is nondegenerate on $K$, there exists
an ordered basis $\Delta = (y_1, \dots, y_n)$ of $K$ with $\operatorname{Tr}_{K/\mathbb Q}(x_i y_j) = \delta_{ij}$. Let us write
$x_j = \sum_i a_{ij} y_i$ with all $a_{ij}$ in $\mathbb Q$. According to Proposition 5.1, $D_K$ equals the
discriminant $D(\Gamma)$ of $\Gamma$, defined in Section V.2 by $D(\Gamma) = \det[\operatorname{Tr}_{K/\mathbb Q}(x_i x_j)]_{ij}$.
Substituting $x_j = \sum_i a_{ij} y_i$, we obtain


$$D_K = \det\Big[\sum_k a_{kj} \operatorname{Tr}_{K/\mathbb Q}(x_i y_k)\Big]_{ij} = \det\Big[\sum_k a_{kj}\,\delta_{ik}\Big]_{ij} = \det[a_{ij}]_{ij}.$$


Thus $|D_K| = |\widehat{T}/T| = |D(K/\mathbb Q)^{-1}/T|$, as asserted.
In a moment we shall show that
$$|D(K/\mathbb Q)^{-1}/T| = |T/D(K/\mathbb Q)|, \qquad (*)$$


from which we conclude that $|D_K| = N(D(K/\mathbb Q))$. Assuming (*), we continue.


Unique factorization of ideals allows us to write $D(K/\mathbb Q) = P_1^{e_1'} \cdots P_g^{e_g'} Q$, where
$Q$ is an ideal relatively prime to $(p)$. Combining the equality $|D_K| = N(D(K/\mathbb Q))$
with Proposition 5.4 shows that


$$|D_K| = N(D(K/\mathbb Q)) = N(Q) \prod_{j=1}^{g} N(P_j)^{e_j'} = N(Q) \prod_{j=1}^{g} p^{f_j e_j'},$$


where $N(Q)$ is an integer not divisible by $p$ and where $f_j = \dim_{\mathbb F_p}(T/P_j)$ for
$1 \le j \le g$. Consequently $D_K$ is prime to $p$ if and only if $e_j' = 0$ for all $j$. If we
take into account that $T$ has the strong approximation property as a consequence
of Theorem 6.44, then application of Theorem 6.45 completes the proof of the
present corollary except for the verification of (*).

Thus we are left with proving that $|D(K/\mathbb Q)^{-1}/T| = |T/D(K/\mathbb Q)|$. More
generally we shall show that


$$|I^{-1}/T| = |T/I| \qquad (**)$$
for every nonzero ideal $I$ in $T$. In turn, we shall deduce (**) after showing that
$$|M/PM| = N(P) \qquad (\dagger)$$
whenever $M$ is a nonzero fractional ideal in $K$ and $P$ is a nonzero prime ideal
in $T$. We do so by showing that $M/PM$ is a vector space over the field $T/P$ of
dimension 1. It is evident that $T$ carries $M$ to itself and $PM$ to itself, and that
$P$ carries $M$ to $PM$. Thus the action of $T$ on $M/PM$ descends to an action of
$T/P$ on $M/PM$. The vector space $M/PM$ is not 0 because $M \ne PM$ by unique
factorization of fractional ideals. To see that $M/PM$ has dimension at most 1,
fix an element $x$ of $M$ that does not lie in $PM$. Then $xT + PM$ is a fractional
ideal of $K$ that is contained in $M + PM = M$ and contains $PM$ and a member
of $M$ that is not in $PM$. Hence it equals $M$. Accordingly, if $y \in M$ is given, we
can choose $t \in T$ such that $xt - y$ is in $PM$. Then $(t + P)(x + PM) = y + PM$,
and $T/P$ carries $x + PM$ onto $M/PM$. So $M/PM$ is 1-dimensional over $T/P$,
and (†) follows.

Returning to (**), let $I = Q_1 \cdots Q_k$ express $I$ as the product of nonzero prime
ideals. Iterated application of (†) and the First Isomorphism Theorem gives


$$|I^{-1}/T| = |I^{-1}/Q_1 \cdots Q_k I^{-1}| = |I^{-1}/Q_1 \cdots Q_{k-1} I^{-1}|\,N(Q_k)$$
$$= |I^{-1}/Q_1 \cdots Q_{k-2} I^{-1}|\,N(Q_k)\,N(Q_{k-1})$$
$$= \cdots = |I^{-1}/I^{-1}| \prod_{j=1}^{k} N(Q_j) = N(I).$$


This proves (**) and therefore also (*).
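
The statement of Corollary 6.49 can be spot-checked for quadratic fields. The sketch below is an illustration only, with assumptions beyond this section: it takes $K = \mathbb Q(\sqrt d)$ with $d$ squarefree, where the ring of integers is generated by a single element $\omega$, and it relies on the classical fact (Kummer's criterion) that in this monogenic case the factorization of $(p)T$ mirrors the factorization modulo $p$ of the minimal polynomial of $\omega$. The function names are hypothetical. A quadratic polynomial with exactly one root modulo $p$ has a double root, which corresponds to ramification.

```python
def field_discriminant(d):
    """Field discriminant of Q(sqrt(d)) for squarefree d other than 0 and 1."""
    return d if d % 4 == 1 else 4 * d

def generator_poly(d):
    """Coefficients (b, c) of X^2 + b*X + c, the minimal polynomial of omega."""
    if d % 4 == 1:
        return (-1, (1 - d) // 4)   # omega = (1 + sqrt(d)) / 2
    return (0, -d)                  # omega = sqrt(d)

def splitting_type(d, p):
    """Classify the prime p in Q(sqrt(d)) by counting roots of the polynomial mod p."""
    b, c = generator_poly(d)
    roots = [x for x in range(p) if (x * x + b * x + c) % p == 0]
    return {2: "split", 1: "ramified", 0: "inert"}[len(roots)]

def ramified_primes(d, bound):
    """Primes below bound that ramify, found by root counting (not via D_K)."""
    primes = [p for p in range(2, bound) if all(p % q for q in range(2, p))]
    return [p for p in primes if splitting_type(d, p) == "ramified"]
```

As a usage example, `ramified_primes(-5, 60)` returns `[2, 5]`, which are exactly the primes dividing `field_discriminant(-5) == -20`.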


One more point needs explanation. The discussion in Section IX.17 of Basic
Algebra concerned a monic irreducible polynomial $F(X)$ in $\mathbb Z[X]$ and its
reduction $\bar F(X)$ modulo $p$, and the interest was in the Galois group $G$ of the splitting
field $K'$ of $F(X)$ over $\mathbb Q$. Theorem 9.64 of that book dealt with the natural
homomorphism from a decomposition subgroup $G_P$ of $G$ onto the Galois group
$\bar G$ of the splitting field over $\mathbb F_p$ of $\bar F(X)$, and it was asserted without proof that
this homomorphism is one-one if $p$ does not divide the discriminant of $F(X)$.
The order of the kernel of the homomorphism was identified as the common
ramification index of the prime ideals $P'$ containing $(p)R'$, $R'$ being the ring
of algebraic integers in $K'$. Let $K = \mathbb Q[X]/(F(X))$. Except in the quadratic
case, the field $K$ typically has much lower dimension over $\mathbb Q$ than $K'$ does. The
Dedekind Discriminant Theorem relates $D_K$ to ramification relative to $K$, as well
as $D_{K'}$ to ramification relative to $K'$. We know that primes not dividing the
discriminant of $F(X)$ do not divide $D_K$, but we need a proof that primes not
dividing the discriminant of $F(X)$ do not divide $D_{K'}$.

To approach this question, one needs the notion of “relative discriminant”
analogous to that of “relative different” for an extension $K/F$ of number fields. The
relative different is defined so as to be an ideal for $K$, and the relative discriminant
is an ideal for $F$. (The field discriminant is the generator of the relative
discriminant for $K/\mathbb Q$ with the appropriate sign attached.) One proves that the behavior
of the relative discriminant under successive extension is reasonable, just as it is
for degree of extension, ramification indices, residue class degrees, and relative
differents. These results show that if $\mathbb Q \subseteq K \subseteq L$, then the field discriminant for
$K$ divides the field discriminant for $L$. The next step is to extend the notion of
field discriminant so that it applies to commutative semisimple algebras and to
show that the discriminant of a tensor product over $\mathbb Q$ of finitely many number
fields is a certain function of the field discriminants and dimensions of the factors.
Finally we return to $F(X)$ and its splitting field $K'$. Let $\xi$ be a root of $F(X)$ in
$K'$, and let $\xi_1 = \sigma_1(\xi), \dots, \xi_n = \sigma_n(\xi)$ be the distinct conjugates of $\xi$. Then $K'$ is
generated by the subfields $\mathbb Q(\xi_1), \dots, \mathbb Q(\xi_n)$, and the ($\mathbb Q$ multilinear) multiplication map
extends to an algebra homomorphism of $\mathbb Q(\xi_1) \otimes_{\mathbb Q} \cdots \otimes_{\mathbb Q} \mathbb Q(\xi_n)$ onto $K'$. As
the tensor product of commutative semisimple algebras in characteristic 0, this is
commutative semisimple (Corollary 2.37) and is therefore a direct sum of fields
(Theorem 2.2). Thus we can regard $K'$ as a subfield of the tensor product of fields
isomorphic to $\mathbb Q[X]/(F(X))$, and the discriminant of $K'$ divides the discriminant
of the tensor product. Putting everything together, we see that the only possible
primes dividing $D_{K'}$ are the primes that divide $D_K$. Therefore the primes that fail
to divide the discriminant of $F(X)$ do not ramify in $K'$.


9. Global and Local Fields 


A global field $K$ is either a number field, i.e., a finite extension of $\mathbb Q$, or a function
field in one variable over a finite field, i.e., a finite extension of some $\mathbb F_q(X)$, where
$\mathbb F_q$ is a finite field.²⁴ An example of the latter is


$$K = \mathbb F_p(x)[y]/(y^2 - (x^3 - x)) = \mathbb F_p(x)\big[\sqrt{x^3 - x}\,\big].$$


In this section we shall develop some machinery for working with global fields. 
Our interest at present is in number fields, but function fields in one variable are 
the object of study in Chapter IX. Consequently the results will be stated for 
all global fields as long as all global fields can readily be treated together, and 
thereafter we shall specialize to number fields. 

The virtue of global fields for current purposes is that their completions with
respect to nontrivial absolute values are always locally compact with a nontrivial
topology. In the case of number fields, we know this for archimedean absolute
values by Proposition 6.27, and it follows for nonarchimedean absolute values
by Corollary 6.21 and Theorem 6.26. In the function-field case as above, the
completions have to be nonarchimedean by Proposition 6.14, and their absolute
values have to be discrete by Corollary 6.22; then the residue class fields are always
finite, and Theorem 6.26 shows that the completions are all locally compact with
a nontrivial topology.


²⁴It will be shown in Chapter VII that a function field in one variable over a finite field is always
a finite separable extension of $\mathbb F_q(Y)$ for a suitable indeterminate $Y$.

To study a global field K in the style of this chapter, one studies simultaneously 
the completions²⁵ of K with respect to one absolute value from each equivalence
class.²⁶ Two completions are said to be equivalent completions if the absolute
values on the domains of the completion maps are equivalent in the sense of Sec- 
tion 3. An equivalence class of completions of nontrivial absolute values is called 
a place of K. A place is called archimedean or nonarchimedean according as 
the corresponding absolute values are archimedean or nonarchimedean; in the 
archimedean case it is called real or complex according as the locally compact 
completed field is R or C. 

Because of the special hypotheses for the situation with global fields, we shall 
see that to each place corresponds a distinguished choice of an absolute value 
on K from the equivalence class, called the normalized absolute value in the 
class.²⁷ These normalized completions are glued together²⁸ in a fashion to be
described in the next section to form the ring of “adeles” of K and the group of 
“ideles” of K. Historically ideles preceded adeles, and ideles were introduced in 
order to reinterpret class field theory and improve upon it; convincing motivation 
is therefore not readily at hand without knowledge that extends beyond this book. 
However, we can get some advance insight into how adeles and ideles might be 
useful from the first part of the classical proof of the Dirichlet Unit Theorem 
(Theorem 5.13) as given in Section V.5. 

That proof in effect handles archimedean places in a way similar to the way 
that adeles handle all places. In more detail let K be a number field of degree 
n over Q, and let R be its ring of algebraic integers. In Chapter V we usually 
regarded K as a subfield of C, but we shall not do so here. As was observed 
in Section V.2, there exist exactly n field mappings of K into C, and we denote
them by $\sigma_1, \ldots, \sigma_n$. If x is in K, then the images $\sigma_1(x), \ldots, \sigma_n(x)$ are called
the conjugates of x. Among $\sigma_1, \ldots, \sigma_n$ are $r_1$ real-valued mappings and $r_2$
complex conjugate pairs, with $r_1 + 2r_2 = n$. Let us number the mappings so that
$\sigma_1, \ldots, \sigma_{r_1}$ are real-valued and so that $\sigma_{r_1+1}, \ldots, \sigma_{r_1+r_2}$ pick out one from each
complex conjugate pair. Proposition 6.27 shows that the functions $x \mapsto |\sigma_1(x)|$,


²⁵ It is important not to lose sight of the fact that a “completion” is a certain kind of homomorphism
of valued fields and does not consist merely of the range space.

²⁶ The completion of the trivial absolute value is excluded.

²⁷ The range of each completion is a locally compact field whose topology is not the discrete
topology. Such a field is often called a local field in books. Examples are R, C, p-adic fields, and
fields $\mathbb{F}_q((X))$ of formal Laurent series. One can show that there are no other locally compact fields
whose topology is not discrete. The definition of “local field” in some books is arranged to exclude
R and C.

²⁸ It is tempting to think in terms of the gluing as involving just the locally compact fields, but
the completion mappings play a role and that description is thus an oversimplification.




$\ldots$, $x \mapsto |\sigma_{r_1+r_2}(x)|$ are a complete set of representatives for the archimedean
places of K; the first $r_1$ are real, and the last $r_2$ are complex.

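Just before Lemma 5.17 we introduced the mapping $\Phi : K \to \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$ given
by


\[ \Phi(x) = \big(\sigma_1(x), \ldots, \sigma_{r_1}(x), \sigma_{r_1+1}(x), \ldots, \sigma_{r_1+r_2}(x)\big) \qquad \text{for } x \in K. \]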


Lemma 5.17 observed that the image $\Phi(R)$ of R is a lattice in $\mathbb{R}^{r_1} \times \mathbb{C}^{r_2} \cong \mathbb{R}^n$.
The starting point for proving the Dirichlet Unit Theorem in Section V.5 was to 
apply the Minkowski Lattice-Point Theorem to this lattice ®(R). Proposition 
6.27 allows us to interpret the mapping ® as the natural embedding of K into the 
product of its completions at all archimedean places. 

The ring of adeles of K will be a corresponding space for dealing with com- 
pletions with respect to all nontrivial absolute values, archimedean and nonar- 
chimedean. 

While we have the archimedean places of the number field K at hand, let us 
address the question of their normalized representatives. Since the field maps 
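from K into C given by $\sigma_{r_1+1}, \ldots, \sigma_{r_1+r_2}$ are equal to the complex conjugates of
$\sigma_{r_1+r_2+1}, \ldots, \sigma_n$, every member x of K has


\[ N_{K/\mathbb{Q}}(x) = \prod_{j=1}^{n} \sigma_j(x) = \Big(\prod_{j=1}^{r_1} \sigma_j(x)\Big)\Big(\prod_{j=r_1+1}^{r_1+r_2} |\sigma_j(x)|^2\Big). \]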


This formula can be viewed as an archimedean analog of the formula in Corollary 
6.37b. The number field Q has one archimedean place, and ordinary absolute 
value is taken as its normalized representative. We denote this representative by 


$|\cdot|_\infty$. With $|\cdot|$ denoting ordinary absolute value on R and C, we obtain

\[ |N_{K/\mathbb{Q}}(x)|_\infty = \Big(\prod_{j=1}^{r_1} |\sigma_j(x)|\Big)\Big(\prod_{j=r_1+1}^{r_1+r_2} |\sigma_j(x)|^2\Big). \]


It is customary to use letters like v and w as indices for places. The real places
are the completions $x \mapsto \sigma_j(x)$, $1 \le j \le r_1$, of K into R, and the normalized
absolute value on K for a real place is the pullback from ordinary absolute
value on R. Thus if $|\cdot|_{\mathbb{R}}$ denotes ordinary absolute value on R and if v is a
real place corresponding to $\sigma_j$, then we define $|x|_v = |\sigma_j(x)|_{\mathbb{R}}$ for $x \in K$. The
normalization to use for the complex places is motivated by the formula above.
If $r_1 + 1 \le j \le r_1 + r_2$, then $\sigma_j$ in effect contributes twice to the above formula,
once from $j$ and once from $j + r_2$, and the notion of normalized absolute value is
to take this double contribution into account. Thus we write $|\cdot|_{\mathbb{C}}$ for the square
of the ordinary absolute value on C; this quantity is not really an absolute value,
since the triangle inequality fails for it, but it has too many desirable features to




be ignored. We define the normalized absolute value on K for a complex place
to be the pullback from this function $|\cdot|_{\mathbb{C}}$ on C even though the result fails to
satisfy the triangle inequality. Thus if v is a complex place corresponding to $\sigma_j$
with $r_1 + 1 \le j \le r_1 + r_2$, then we define $|x|_v = |\sigma_j(x)|_{\mathbb{C}} = |\sigma_j(x)|^2$ for $x \in K$.
With these definitions of normalized absolute values for archimedean places, the
formula above for $|N_{K/\mathbb{Q}}(x)|_\infty$ can be rewritten as


\[ |N_{K/\mathbb{Q}}(x)|_\infty = \Big(\prod_{j=1}^{r_1} |\sigma_j(x)|_{\mathbb{R}}\Big)\Big(\prod_{j=r_1+1}^{r_1+r_2} |\sigma_j(x)|_{\mathbb{C}}\Big) = \Big(\prod_{v\ \mathrm{real}} |x|_v\Big)\Big(\prod_{v\ \mathrm{complex}} |x|_v\Big). \]


We summarize matters in the following proposition. 
Proposition 6.50. If K is a number field, then


\[ |N_{K/\mathbb{Q}}(x)|_\infty = \prod_{v\ \mathrm{archimedean}} |x|_v \qquad \text{for } x \in K, \]


where $|\cdot|_v$ is the pullback of $|\cdot|_{\mathbb{R}}$, the ordinary absolute value, for real places
and where $|\cdot|_v$ is the pullback of $|\cdot|_{\mathbb{C}}$, the ordinary absolute value squared, for
complex places.
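
To see Proposition 6.50 at work, here is a small worked check. The two fields and the two elements below are chosen purely for illustration and are not taken from the text; the first field has two real places $v_1, v_2$ and no complex place, and the second has a single complex place $v$.

```latex
% K = Q(sqrt 2): r_1 = 2, r_2 = 0, real places v_1 and v_2
\[
x = 3 + \sqrt{2}: \qquad
N_{K/\mathbb{Q}}(x) = (3+\sqrt{2})(3-\sqrt{2}) = 7, \qquad
|x|_{v_1}\,|x|_{v_2} = |3+\sqrt{2}|\;|3-\sqrt{2}| = 7 .
\]
% K = Q(i): r_1 = 0, r_2 = 1, one complex place v, normalized by squaring
\[
x = 1 + 2i: \qquad
N_{K/\mathbb{Q}}(x) = (1+2i)(1-2i) = 5, \qquad
|x|_{v} = |1+2i|^{2} = 5 .
\]
```

In the second case the ordinary absolute value $|1+2i| = \sqrt{5}$ would not reproduce the norm; the squaring built into $|\cdot|_{\mathbb{C}}$ is what makes the two sides agree.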


At this point we could give a definition of normalized absolute value corre- 
sponding to nonarchimedean places. But we shall digress in order to motivate 
the definition using concepts from measure theory that may be known to some 
readers and not to others. These concepts play a role within the text only in the 
next paragraph and in Example 4 of normalized discrete absolute values below, 
and the reader will not miss any results or proofs by skipping this material. 

The digression begins. Any locally compact group has a nonzero measure
on it that is invariant under left translation,²⁹ and this measure is unique up to
multiplication by a scalar. Let a locally compact field L be given, and let $\mu$ be
an invariant measure of this kind with respect to the additive group of L. Each
nonzero element c of L has the property that $\mu(cE)$ is a multiple of $\mu(E)$ that
is independent of E. If we write $|c|_L$ for this multiple and put $|0|_L = 0$, then it
turns out that some power $|\cdot|_L^{\alpha}$ with $0 < \alpha \le 1$ is necessarily an absolute value
and that this power $\alpha$ can be taken to be 1 in all cases except when $L = \mathbb{C}$. In the
case of C, it is easy to check that $|c|_{\mathbb{C}} = |c|^2$, and the triangle inequality therefore


²⁹ Although the details will not be important for us, let us be more precise: The measure is on
the $\sigma$-algebra of “Baire sets” on the group, that is, the smallest $\sigma$-algebra containing those compact sets
that are intersections of countably many open sets. The measure is not the 0 measure, it is finite on
all the generating compact sets, and it takes the same value on a set as it does on any left translate
of the set. It is called a left Haar measure. For more information, see the author’s Advanced Real
Analysis, Chapter VI.




fails for $\alpha = 1$. But in all other cases, $|\cdot|_L$ is a canonical choice for an absolute
value on L. Now suppose that $\varphi : K \to L$ is a field map of a global field K onto
a dense subfield of a locally compact field. We impose this special absolute value
$|\cdot|_L$ on L. Then a necessary and sufficient condition on an absolute value $|\cdot|_K$
for $\varphi : (K, |\cdot|_K) \to (L, |\cdot|_L)$ to be a completion is that $|\cdot|_K = \varphi^*(|\cdot|_L)$.
In other words, the pullback of the special normalization of the absolute value on
the locally compact field is the natural normalization to use for the absolute value
on the global field.

With the digression now over, we want to associate to each nonarchimedean 
place of a global field a special normalization of an absolute value. (We handled 
the question of normalization at archimedean places earlier in the section.) We can 
be a bit more general. Suppose that F is an arbitrary field with a discrete valuation
v and with corresponding nontrivial absolute value given by $|x|_v = r^{-v(x)}$ for
some $r > 1$. Let R be the valuation ring and $\mathfrak{p}$ the valuation ideal; $\mathfrak{p}$ is a principal
ideal of the form $(\pi)$ for some $\pi \in R$. Suppose that the residue class field $R/\mathfrak{p}$ is
finite. Then we say that $|\cdot|_v$ is normalized if $|\pi|_v = |R/\mathfrak{p}|^{-1}$. This definition
is independent of the choice of $\pi$.


EXAMPLES OF NORMALIZED DISCRETE ABSOLUTE VALUES. 


(1) The field Q and the p-adic absolute value given by $|ab^{-1}p^k|_p = p^{-k}$ when
a and b are integers prime to p. The valuation ring R consists of all $ab^{-1}$ with
$a \in \mathbb{Z}$, $b \in \mathbb{Z}$, and b prime to p. The valuation ideal $\mathfrak{p}$ consists of all such $ab^{-1}$
with a divisible by p, and the quotient $R/\mathfrak{p}$ is isomorphic to $\mathbb{F}_p$. The element
$\pi$ may be taken to be p, and $|p|_p$ equals $p^{-1}$, which equals $|R/\mathfrak{p}|^{-1}$. Thus the
p-adic absolute value on Q is normalized.
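
Example 1 can be made concrete with a few lines of code. The following is a minimal sketch, assuming rationals are represented by Python's `fractions.Fraction`; the function names `padic_valuation` and `padic_abs` are hypothetical helpers introduced here for illustration and are not notation from the text.

```python
from fractions import Fraction


def padic_valuation(y: Fraction, p: int) -> int:
    """Return the integer k with y = (a/b) * p**k, where a and b are prime to p."""
    if y == 0:
        raise ValueError("0 has no finite p-adic valuation")
    num, den, k = y.numerator, y.denominator, 0
    while num % p == 0:   # each factor p in the numerator raises k
        num //= p
        k += 1
    while den % p == 0:   # each factor p in the denominator lowers k
        den //= p
        k -= 1
    return k


def padic_abs(y: Fraction, p: int) -> Fraction:
    """Normalized p-adic absolute value |y|_p = p**(-k), with |0|_p = 0."""
    if y == 0:
        return Fraction(0)
    return Fraction(1, p) ** padic_valuation(y, p)


print(padic_abs(Fraction(12, 5), 2))  # 12/5 = 2**2 * (3/5)
print(padic_abs(Fraction(12, 5), 5))  # 12/5 = 5**(-1) * 12
print(padic_abs(Fraction(7), 7))      # the uniformizer pi = p for p = 7
```

The last call evaluates the absolute value at the uniformizer $\pi = p$, which is the quantity that the normalization condition compares with $|R/\mathfrak{p}|^{-1} = p^{-1}$.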


(2) Let K be a number field of degree n over Q, and let T be its ring of algebraic
integers. Let $\mathfrak{p}$ be a nonzero prime ideal in T, and let v be the corresponding
valuation of K. Let $q = |T/\mathfrak{p}|$, and define $|x|_{\mathfrak{p}} = q^{-v(x)}$. Then $|\cdot|_{\mathfrak{p}}$ is
normalized because Theorem 6.5e shows that the residue class field obtained
from the valuation is isomorphic to $T/\mathfrak{p}$.

(3) Let $K = \mathbb{F}_q(X)$, fix a prime polynomial $c(X)$ in $\mathbb{F}_q[X]$, and consider
the absolute value on K defined by $|a(X)b(X)^{-1}c(X)^k| = q^{-k \deg c(X)}$ whenever
$a(X)$ and $b(X)$ are polynomials relatively prime to $c(X)$. This example runs
completely parallel to the two previous examples, and $\pi$ may be taken to be
$c(X)$. The residue class field has as representatives all polynomials $h(X)$ with
$\deg h(X) < \deg c(X)$ and thus has order $q^{\deg c(X)}$. This order matches $|c(X)|^{-1}$,
and hence $|\cdot|$ is normalized.
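
A small instance of Example 3 may help fix the bookkeeping. The choice $q = 3$ and $c(X) = X^2 + 1$ below is an illustration of ours, not one from the text; $X^2 + 1$ has no root in $\mathbb{F}_3$ and is therefore prime in $\mathbb{F}_3[X]$.

```latex
% q = 3, c(X) = X^2 + 1 (irreducible over F_3 since 0, 1, 2 are not roots)
\[
|X^2 + 1| = 3^{-\deg (X^2+1)} = 3^{-2} = \tfrac{1}{9},
\qquad
R/\mathfrak{p} \cong \mathbb{F}_3[X]/(X^2+1),
\qquad
|R/\mathfrak{p}| = 3^{2} = 9 .
\]
% a general element: numerator and denominator factors prime to c(X) do not contribute
\[
\left| \frac{X\,(X^2+1)^{3}}{X+1} \right| = 3^{-3 \cdot 2} = 3^{-6}.
\]
```

The nine residue classes are represented by the polynomials $a + bX$ with $a, b \in \mathbb{F}_3$, in line with the description of representatives of degree less than $\deg c(X)$.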


(4) If F is a locally compact field whose topology comes from some nontrivial 
discrete absolute value with finite residue class field, then the canonical absolute 
value $|\cdot|_F$ described in the digression above and obtained from an invariant




measure $\mu$ on the additive group of F is normalized. To see this, let R and
$\mathfrak{p}$ be the valuation ring and valuation ideal, and write $\mathfrak{p} = (\pi)$. Put $m =
|R/\mathfrak{p}|$, and let $x_1, \ldots, x_m$ be representatives of the m cosets of $R/\mathfrak{p}$ in R. Then
$\mu(x_j + \mathfrak{p}) = \mu(\mathfrak{p})$ for $1 \le j \le m$ by translation invariance of $\mu$, and hence
$\mu(R) = \sum_{j=1}^{m} \mu(x_j + \mathfrak{p}) = m\,\mu(\mathfrak{p})$. Substituting and using the definition of $|\cdot|_F$
gives $\mu(\mathfrak{p}) = \mu(\pi R) = |\pi|_F\,\mu(R) = |\pi|_F\, m\,\mu(\mathfrak{p})$. The number $\mu(\mathfrak{p})$ is positive,
since $\mathfrak{p}$ is a nonempty open subset of F, and we can cancel to get $|\pi|_F\, m = 1$.
Thus $|\pi|_F = |R/\mathfrak{p}|^{-1}$, and $|\cdot|_F$ is normalized.


Theorem 6.51 (Artin product formula). If F is a number field and if normalized
absolute values are used, then


\[ \prod_{v} |x|_v = 1 \qquad \text{for all nonzero } x \in F, \]


the product being taken over all places v. In this product, only finitely many of 
the factors can be different from 1. 


REMARKS. A version of this theorem is valid for function fields in one variable. 
As Corollary 6.22 permits, one can state this analogous theorem in terms of 
discrete valuations that are trivial on the base field, and absolute values need play 
no role. The precise statement and proof appear in Chapter IX. Corollary 6.9 in 
the present chapter is a special case. 


PROOF. First we prove the result for Q. Let a rational $y = \pm p_1^{k_1} \cdots p_r^{k_r}$ be
given; here $p_1, \ldots, p_r$ are distinct primes. The product $\prod_v |y|_v$ is taken over
all places, hence over all primes and the one archimedean place $\infty$. For this
$y \in \mathbb{Q}$ we have $|y|_{p_j} = p_j^{-k_j}$ for $1 \le j \le r$ and $|y|_{p'} = 1$ for all other
primes $p'$. So $\prod_{p\ \mathrm{prime}} |y|_p = p_1^{-k_1} \cdots p_r^{-k_r}$. Since $|y|_\infty = p_1^{k_1} \cdots p_r^{k_r}$, we obtain
$\prod_{\mathrm{all}\ v} |y|_v = 1$.

Let R be the ring of algebraic integers in F. Given x in F, factor the fractional
ideal $xR$. The nonarchimedean places correspond to the nonzero prime ideals
in R, and $|x|_v$ is 1 except for the v’s corresponding to those prime ideals in the
factorization. There are only finitely many of these. Since also there are only
finitely many archimedean places, we see that $|x|_v = 1$ for all but finitely many v.

Let us consider the nonarchimedean places separately from the archimedean 
ones. The nonarchimedean places correspond to nonzero prime ideals $\wp$, and we
group these according to the prime number p such that $\wp \cap \mathbb{Z} = p\mathbb{Z}$, writing
$\wp \mid p\mathbb{Z}$ for this correspondence. For fixed p and for each $\wp$ with $\wp \mid p\mathbb{Z}$, let
$x_\wp$ be the image of x under the local embedding in $F_\wp$. Corollary 6.37b gives
$N_{F/\mathbb{Q}}(x) = \prod_{\wp \mid p\mathbb{Z}} N_{F_\wp/\mathbb{Q}_p}(x_\wp)$. Theorem 6.33 shows that $|x_\wp|_{F_\wp}$ is a power
of $|N_{F_\wp/\mathbb{Q}_p}(x_\wp)|_{\mathbb{Q}_p}$. To determine the power, we observe from Example 2 that




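the canonical absolute values on $\mathbb{Q}_p$ and $F_\wp$ are normalized, and we specialize
$|x_\wp|_{F_\wp}$ and $|N_{F_\wp/\mathbb{Q}_p}(x_\wp)|_{\mathbb{Q}_p}$ to $x_\wp$ in $\mathbb{Q}_p$. Making the comparison, we find that
$|N_{F_\wp/\mathbb{Q}_p}(x_\wp)|_{\mathbb{Q}_p} = |x_\wp|_{F_\wp}$. We know that each local embedding respects absolute
values; since Theorems 6.5e and 6.26e together show that the residue class fields
of $F_\wp$ and $\mathbb{Q}_p$ have orders $|R/\wp|$ and $|\mathbb{Z}/p\mathbb{Z}|$, it follows that $|x_\wp|_{F_\wp} = |x|_\wp$.
Therefore


\[
\begin{aligned}
|N_{F/\mathbb{Q}}(x)|_p = |N_{F/\mathbb{Q}}(x)|_{\mathbb{Q}_p}
&= \prod_{\wp \mid p\mathbb{Z}} |N_{F_\wp/\mathbb{Q}_p}(x_\wp)|_{\mathbb{Q}_p} \\
&= \prod_{\wp \mid p\mathbb{Z}} |x_\wp|_{F_\wp} = \prod_{\wp \mid p\mathbb{Z}} |x|_\wp .
\end{aligned}
\tag{$*$}
\]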


For the finitely many archimedean places, Proposition 6.50 gives us the formula


\[ |N_{F/\mathbb{Q}}(x)|_\infty = \prod_{v\ \mathrm{archimedean}} |x|_v, \tag{$**$} \]

where $|\cdot|_\infty$ is the ordinary absolute value on Q. Multiplying (∗) over all primes p by (∗∗) and
using the known identity $\prod_v |y|_v = 1$ for the element $y = N_{F/\mathbb{Q}}(x)$ of Q, we
obtain the theorem.
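
The first step of the proof, the case of Q, is easy to check numerically. The sketch below assumes exact rational arithmetic via `fractions.Fraction`; the helper names `prime_divisors`, `padic_abs`, and `product_over_places` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def prime_divisors(n: int) -> list:
    """Primes dividing the positive integer n, found by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def padic_abs(y: Fraction, p: int) -> Fraction:
    """Normalized p-adic absolute value of a nonzero rational y."""
    num, den, k = y.numerator, y.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return Fraction(1, p) ** k


def product_over_places(y: Fraction) -> Fraction:
    """Product of |y|_v over the archimedean place and all primes p.

    Only primes dividing the numerator or denominator of y contribute a
    factor different from 1, so the infinite product is really a finite one.
    """
    if y == 0:
        raise ValueError("the product formula concerns nonzero y")
    relevant = set(prime_divisors(abs(y.numerator))) | set(prime_divisors(y.denominator))
    total = abs(y)                    # factor from the archimedean place
    for p in relevant:
        total *= padic_abs(y, p)      # factor from the place p
    return total


print(product_over_places(Fraction(-360, 77)))
```

For a general number field the same bookkeeping would require factoring the fractional ideal $xR$, which this sketch does not attempt.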


10. Adeles and Ideles 


In this section we do the gluing that creates the adeles and the ideles out of the 
places of a global field. We begin with a topological construction, and then we 
superimpose the algebraic structure. The general constructions and the two main 
theorems will be valid for all global fields, but we shall discuss proofs of the 
theorems only for number fields. 

Suppose that $\{X_i \mid i \in J\}$ is a nonempty family of locally compact Hausdorff
spaces. Assume that for all but finitely many $i \in J$ we are given a compact open
subset $Z_i$ of $X_i$. The restricted direct product of the $X_i$’s relative to the $Z_i$’s is
the subset

\[ \prod'_{i \in J} X_i \subseteq \prod_{i \in J} X_i \]

defined by

\[ (x_i)_{i \in J} \in \prod'_{i \in J} X_i \quad \text{if and only if } x_i \in Z_i \text{ for all but finitely many } i. \]

The restricted direct product is topologized as follows. Suppose that $S \subseteq J$ is a
finite subset and that $Z_i$ is defined for $i \notin S$. Put


\[ X(S) = \prod_{i \in S} X_i \times \prod_{i \notin S} Z_i. \]




In their respective product topologies the first factor is locally compact, and the 
second factor is compact. Certainly X(S) is a subset of the restricted direct 
product, and evidently the restricted direct product is the union of the subsets 
$X(S)$ over all finite subsets S for which $Z_i$ is defined when $i \notin S$. We topologize
$\prod'_{i \in J} X_i$ by insisting that each $X(S)$ be an open subset.³⁰ The resulting topology
is locally compact Hausdorff. In fact, any two members of $\prod'_{i \in J} X_i$ lie in a
common $X(S)$, and the open sets that separate them in $X(S)$ separate them in
$\prod'_{i \in J} X_i$. Also, any $(x_i)_{i \in J}$ is in some $X(S)$, which is locally compact, and a
compact neighborhood within $X(S)$ will be a compact neighborhood in $\prod'_{i \in J} X_i$.

Now we superimpose the algebraic structure. Let K be a global field. To each
place v of K, we have associated a normalized absolute value $|\cdot|_v$ on K and a
completion $\iota_v : (K, |\cdot|_v) \to (K_v, |\cdot|_{K_v})$. Each of the complete valued fields
$K_v$ is locally compact. Except at the finitely many archimedean places, which
occur only in the number-field case, $|\cdot|_{K_v}$ arises from a discrete valuation. We
take $R_v$ to be the corresponding valuation ring, i.e., $R_v = \{x \in K_v \mid |x|_{K_v} \le 1\}$.
This is a compact open additive subgroup of $K_v$. Thus we can form a restricted
direct product in which the index set J is the set of places of K, the $v^{\mathrm{th}}$ locally
compact Hausdorff space is $K_v$, and the $v^{\mathrm{th}}$ compact open subset is $R_v$. This
restricted direct product carries the structure of a commutative ring with identity,
with its addition and multiplication defined in coordinate-by-coordinate fashion,
and the operations are continuous. Thus we obtain a topological ring, known as
the ring of adeles of K and denoted by $\mathbb{A}_K$ or simply by $\mathbb{A}$ when no ambiguity is
possible.
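
For orientation, it may help to spell out the restricted-product condition in the simplest case $K = \mathbb{Q}$. The tuples below are illustrations of ours; we write $\mathbb{Q}_p$ for the completion at the prime p and $\mathbb{Z}_p$ for its valuation ring, notation that is an assumption here rather than something fixed in the text above.

```latex
% Places of Q: the archimedean place \infty (K_\infty = R) and the primes p (K_p = Q_p, R_p = Z_p)
\[
(x_\infty, x_2, x_3, x_5, \ldots) \in \mathbb{A}_{\mathbb{Q}}
\iff x_p \in \mathbb{Z}_p \ \text{for all but finitely many primes } p .
\]
% An adele lying in X(S) with S = \{\infty\}: every p-component is integral
\[
x_\infty = -\tfrac{3}{2}, \quad x_p = p \ \text{for every prime } p
\qquad\Longrightarrow\qquad |x_p|_p = p^{-1} \le 1 \ \text{for all } p .
\]
% An adele lying in X(S) with S = \{\infty, 2\}: one nonintegral component is allowed
\[
x_\infty = 1, \quad x_2 = \tfrac{1}{8}, \quad x_p = 1 \ \text{for } p \ne 2 .
\]
% Not an adele: infinitely many components fail to be integral
\[
x_p = \tfrac{1}{p} \ \text{for every prime } p
\qquad\Longrightarrow\qquad |x_p|_p = p > 1 \ \text{for all } p .
\]
```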

If for each $x \in K_{v_0}$, we send x into the tuple $(a_v)_v$ that has $a_{v_0} = x$ and $a_v = 0$
for $v \ne v_0$, then the result is a one-one continuous ring homomorphism of $K_{v_0}$
into $\mathbb{A}$. This homomorphism of course does not send the multiplicative identity
of $K_{v_0}$ to the multiplicative identity of $\mathbb{A}$.

The completion mappings $\iota_v : K \to K_v$ embed K into each $K_v$, and we can
form a corresponding diagonal map $\iota : K \to \prod_v K_v$ into the full product of $K_v$’s
by defining $\iota(x) = (\iota_v(x))_v$. Actually, we shall check for $x \ne 0$ that only finitely
many places have $|\iota_v(x)|_{K_v} = |x|_v$ unequal to 1, and therefore the image of the
diagonal map is in the adeles. Thus we have a diagonal ring homomorphism


\[ \iota : K \to \mathbb{A} \qquad \text{given by } \iota(x) = (\iota_v(x))_v \ \text{ for } x \in K. \]


The fact that in the number-field case, $|x|_v$ is unequal to 1 for only finitely many
places appears as part of Theorem 6.51. For the function-field case, the field K is
a finite separable extension of some field $\mathbb{F}_q(X)$, and all but finitely many places
come from nonzero prime ideals in the integral closure R of $\mathbb{F}_q[X]$ in K. At the


³⁰ In other words, a set in $\prod'_{i \in J} X_i$ is open if and only if its intersection with each $X(S)$ is open
in $X(S)$.




unexceptional such places the value of $|x|_v$ comes by treating $xR$ as a fractional
ideal and factoring it; only finitely many ideals are involved in the factorization,
and only those among all the unexceptional places can have $|x|_v \ne 1$. The main
structural theorem about the adeles is as follows.


Theorem 6.52. If K is a global field, then the image of K in the adeles $\mathbb{A}$
under the diagonal mapping $\iota : K \to \mathbb{A}$ is discrete, and the quotient $\mathbb{A}/\iota(K)$ of
additive groups is compact.


For a number field the compactness in Theorem 6.52 encodes Lemma 5.17 
and the Strong Approximation Theorem. The proof of the theorem is not hard, 
and we return to it in a moment. In the current discussion Theorem 6.52 is 
not something to appreciate for its own consequences but instead is a prototype 
for a corresponding theorem about “ideles” that encodes for number fields the 
finiteness of the class number and the Dirichlet Unit Theorem. 

The construction of the “ideles” of K proceeds similarly to the construction
of the adeles. Again we use a restricted direct product, with the set of places as
index set. The locally compact Hausdorff space associated to the place v is the
multiplicative group $K_v^\times$. For v nonarchimedean, we again let $R_v$ be the valuation
ring in $K_v$, and take the compact open subset of $K_v^\times$ to be the group $R_v^\times$ of units
in $R_v$, i.e., $R_v^\times = \{x \in K_v \mid |x|_{K_v} = 1\}$. The group of ideles is the restricted direct
product of the groups $K_v^\times$ relative to the compact subgroups $R_v^\times$. The result is a
locally compact abelian group, known as the group of ideles of K and denoted
by $\mathbb{I}_K$ or simply by $\mathbb{I}$.

Warning: As a set, $\mathbb{I}$ coincides with the group of units $\mathbb{A}^\times$. However, the
topologies do not match. The topology for $\mathbb{I}$ is finer than the relative topology on
$\mathbb{A}$. See Problems 7-8 at the end of the chapter.

If for each $x \in K_{v_0}^\times$, we send x into the tuple $(a_v)_v$ that has $a_{v_0} = x$ and $a_v = 1$
for $v \ne v_0$, then the result is a one-one continuous group homomorphism of $K_{v_0}^\times$
into $\mathbb{I}$. As with the adeles we also have a diagonal mapping $\iota : K^\times \to \mathbb{I}$ given by
$\iota(x) = (\iota_v(x))_v$; the image is contained in $\mathbb{I}$, since for a nonzero $x \in K$, $|x|_v$ can
be unequal to 1 for only finitely many v.

The Artin product formula (Theorem 6.51) and the corresponding result for
function fields in one variable over a finite field put a constraint on the image. We
define the absolute value $|(a_v)_v|$ of an idele $(a_v)_v$ to be the product of the absolute
values of the components: $|(a_v)_v| = \prod_v |a_v|_v$. This is well defined because only
finitely many factors are allowed to be different from 1. If $\mathbb{I}^1$ denotes the group
of ideles of absolute value 1, then $\mathbb{I}^1$ is a closed subgroup of $\mathbb{I}$. The Artin product
formula and its function-field analog imply that the image of the diagonal mapping
is contained in $\mathbb{I}^1$. The main structural theorem about the ideles is as follows.




Theorem 6.53. If K is a global field, then the image of $K^\times$ in the subgroup
$\mathbb{I}^1$ of the ideles $\mathbb{I}$ under the diagonal mapping $\iota : K^\times \to \mathbb{I}$ is discrete, and the
quotient group $\mathbb{I}^1/\iota(K^\times)$ is compact.


From now on, we suppose that the global field K is a number field. Let
$S_\infty$ be the set of archimedean places. We begin by supplying direct proofs
of the discreteness in Theorems 6.52 and 6.53 and of the compactness of the
quotient in Theorem 6.52. After some additional discussion we return to prove
the compactness of the quotient in Theorem 6.53.


PROOF OF DISCRETENESS OF $\iota(K)$ IN THEOREM 6.52. It is enough to produce
a neighborhood U of 0 in $\mathbb{A}$ such that $U \cap \iota(K) = \{0\}$. The set U of all
$(x_v)_v \in \mathbb{A}$ such that $|x_v|_v < 1$ for all archimedean places and $|x_v|_v \le 1$ for all
nonarchimedean places is an open product set in $\mathbb{A}(S_\infty)$ and hence is an open
neighborhood of 0 in $\mathbb{A}$. Since Theorem 6.51 shows that $\prod_v |\iota_v(y)|_v = 1$ for all
$y \ne 0$ in K and since $\prod_v |x_v|_v < 1$ for all $(x_v)_v$ in U, $U \cap \iota(K) = \{0\}$.


PROOF OF DISCRETENESS OF $\iota(K^\times)$ IN THEOREM 6.53. The set U of all
$(x_v)_v \in \mathbb{I}$ such that $|x_v - 1|_v < 1$ for all archimedean places and $|x_v - 1|_v \le 1$ for all
nonarchimedean places is an open product set in $\mathbb{I}(S_\infty)$ and hence is an open
neighborhood of 1 in $\mathbb{I}$. If $(x_v)_v = \iota(y)$ with $y \in K^\times$ and $y \ne 1$, then $x_v - 1 = \iota_v(y - 1)$
with $y - 1 \ne 0$, and Theorem 6.51 shows that $\prod_v |\iota_v(y) - 1|_v = \prod_v |\iota_v(y - 1)|_v = 1$.
The members $(x_v)_v$ of U all have $\prod_v |x_v - 1|_v < 1$, and thus $U \cap \iota(K^\times) = \{1\}$.


PROOF OF COMPACTNESS OF $\mathbb{A}/\iota(K)$ IN THEOREM 6.52. We begin by observing
that

\[ \mathbb{A} = \iota(K) + \mathbb{A}(S_\infty), \tag{$*$} \]

i.e., that the set of sums of a member of $\iota(K)$ and a member of $\mathbb{A}(S_\infty)$ exhausts $\mathbb{A}$.
In fact, given $(x_v)_v$ in $\mathbb{A}$, we let $v_1, \ldots, v_r$ be the finitely many nonarchimedean
places for which $|x_{v_j}|_{v_j} > 1$. The Strong Approximation Theorem (Theorem
6.44) applied to the elements $x_{v_1}, \ldots, x_{v_r}$ produces a member y of K such that
$|\iota_{v_j}(y) - x_{v_j}|_{v_j} \le 1$ for $1 \le j \le r$ and such that $|\iota_v(y)|_v \le 1$ for all other
nonarchimedean places v. Consequently $|\iota_v(y) - x_v|_v \le 1$ for all nonarchimedean
v. This inequality means exactly that $(x_v)_v - \iota(y)$ is in $\mathbb{A}(S_\infty)$. Hence


\[ (x_v)_v = \iota(y) + \big((x_v)_v - \iota(y)\big) \]


is the required decomposition, and (∗) is proved.
In addition, we have

\[ \iota(R) = \iota(K) \cap \mathbb{A}(S_\infty). \tag{$**$} \]




In fact, the inclusion $\subseteq$ is clear. For the inclusion $\supseteq$, let y be a member of K such
that $\iota(y)$ is in $\mathbb{A}(S_\infty)$. Then $|\iota_v(y)|_v \le 1$ for all nonarchimedean v, and it follows
that y is in R.

To prove the compactness, we use the identity $(M+N)/M \cong N/(M \cap N)$ given
by the Second Isomorphism Theorem in the category of locally compact abelian
groups, taking $M = \iota(K)$ and $N = \mathbb{A}(S_\infty)$. Then (∗) shows that $M + N = \mathbb{A}$,
and (∗∗) shows that $M \cap N = \iota(R)$. Hence


\[ \mathbb{A}/\iota(K) \cong \mathbb{A}(S_\infty)/\iota(R). \tag{$\dagger$} \]


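Let us write $\mathbb{A}(S_\infty) = \Omega \times \Delta$, where $\Omega = \mathbb{R}^{r_1} \times \mathbb{C}^{r_2} = \prod_{v\ \mathrm{archimedean}} K_v$ and
$\Delta = \prod_{v\ \mathrm{nonarchimedean}} R_v$. The mapping $\Phi : K \to \Omega$ defined near the beginning
of Section 9 has the property that

\[ \iota(R) + (\{0\} \times \Delta) = \Phi(R) \times \Delta. \]

From this equality we obtain

\[ \mathbb{A}(S_\infty)/\big(\iota(R) + (\{0\} \times \Delta)\big) = (\Omega \times \Delta)/(\Phi(R) \times \Delta) \cong \Omega/\Phi(R), \]

and Lemma 5.17 shows that this is compact. Since $(\{0\} \times \Delta) \cap \iota(R) = \{0\}$,
application of the First Isomorphism Theorem and then the Second Isomorphism
Theorem gives

\[
\begin{aligned}
\big(\mathbb{A}(S_\infty)/\iota(R)\big) \big/ \big(\mathbb{A}(S_\infty)/(\iota(R) + (\{0\} \times \Delta))\big)
&\cong \big(\iota(R) + (\{0\} \times \Delta)\big)/\iota(R) \\
&\cong (\{0\} \times \Delta)/\big((\{0\} \times \Delta) \cap \iota(R)\big) \\
&= \{0\} \times \Delta,
\end{aligned}
\]

and this is compact also. So the closed subgroup $\mathbb{A}(S_\infty)/(\iota(R) + (\{0\} \times \Delta))$ of
$\mathbb{A}(S_\infty)/\iota(R)$ and the quotient by this subgroup are both exhibited as compact, and
it follows that $\mathbb{A}(S_\infty)/\iota(R)$ is compact. Application of (†) shows that $\mathbb{A}/\iota(K)$ is
compact.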


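A first approach to proving the compactness of $\mathbb{I}^1/\iota(K^\times)$ in Theorem 6.53 is to
pursue an analogy with the above proof for $\mathbb{A}/\iota(K)$ by showing that multiplicative
analogs of (∗) and (∗∗) from that proof are valid here:

\[
\begin{aligned}
\mathbb{I} &= \iota(K^\times)\,\mathbb{I}(S_\infty), \\
\iota(R^\times) &= \iota(K^\times) \cap \mathbb{I}(S_\infty).
\end{aligned}
\]

The second of these formulas is fine and is easily proved: The inclusion $\iota(R^\times) \subseteq
\iota(K^\times) \cap \mathbb{I}(S_\infty)$ is clear. For the inclusion $\iota(R^\times) \supseteq \iota(K^\times) \cap \mathbb{I}(S_\infty)$, let y be a
member of $K^\times$ such that $\iota(y)$ is in $\mathbb{I}(S_\infty)$. Then $|\iota_v(y)|_v = 1$ for all nonarchimedean
v, and it follows that y and $y^{-1}$ are in R, hence that y is in $R^\times$.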




The difficulty is that an equality $\mathbb{I} = \iota(K^\times)\,\mathbb{I}(S_\infty)$ holds if and only if the ring
R of algebraic integers in K is a principal ideal domain. Let us elaborate on this
point, since we will be led by it to the relationship between ideles and the ideal
class group that makes ideles useful.

Let us enumerate the nonzero prime ideals of R as $P_1, P_2, \ldots$ in some fashion.
As was mentioned in Section 2, each nonzero fractional ideal $I$ in K has a finite
unique factorization of the form $I = P_{i_1}^{k_{i_1}} \cdots P_{i_n}^{k_{i_n}}$, where $k_{i_1}, \ldots, k_{i_n}$ are integers.
The mapping that carries $I$ to the tuple $(a_j)_{j \ge 1}$ with $a_j = k_{i_m}$ when $j = i_m$ and
$a_j = 0$ when j is not in $\{i_1, \ldots, i_n\}$ is a group isomorphism $\Psi$ from the group $\mathcal{I}$
of fractional ideals onto a free abelian group $\bigoplus_{j=1}^{\infty} \mathbb{Z}$ of countably infinite rank.
Some of these fractional ideals are of the form $xR$ for some $x \in K^\times$, and they are
the principal fractional ideals. They form a subgroup $\mathcal{P}$ of $\mathcal{I}$ that is isomorphic
to $K^\times/R^\times$, and the quotient $\mathcal{I}/\mathcal{P}$ is isomorphic to the ideal class group of K, as was
shown at the end of Section 2. Theorem 5.19 says that the group $\mathcal{I}/\mathcal{P}$ is a finite
group; its order is the class number of K.

Meanwhile, suppose that $(x_v)_v$ is a member of the group $\mathbb{I}$ of ideles. To
each nonarchimedean place v, Corollary 6.8 associates a unique nonzero prime
ideal, which we write as $P_{i(v)}$ for a function $i(\cdot)$. If $q_v = |R/P_{i(v)}|$, then the
relationship between the valuation $\mathrm{ord}_v(\cdot)$ and the normalized absolute value
associated to $P_{i(v)}$ is $|x|_v = q_v^{-\mathrm{ord}_v(x)}$. Since $(x_v)_v$ is an idele, there are only
finitely many nonarchimedean v’s for which $\mathrm{ord}_v(x_v)$ is not 0. We can therefore
map $(x_v)_v$ into the tuple of integers $(\mathrm{ord}_v(x_v))_v$ and compose with $\Psi^{-1}$ to obtain
a homomorphism of the group $\mathbb{I}$ into the group $\mathcal{I}$ of fractional ideals. In more
detail, the mapping from $\mathbb{I}$ to $\bigoplus_{j=1}^{\infty} \mathbb{Z}$ is given by $(x_v)_v \mapsto (a_j)_{j \ge 1}$ with $a_{i(v)} =
\mathrm{ord}_v(x_v)$, and then $\Psi^{-1}$ interprets this sequence of integers as the exponents of
the appropriate prime ideals. Since any association of members of $K_v^\times$ at finitely
many nonarchimedean places can be extended to an idele by making the idele
be 1 at the remaining places, this homomorphism of $\mathbb{I}$ into $\mathcal{I}$ is onto $\mathcal{I}$.
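
The map from ideles to fractional ideals can be illustrated in the simplest case $K = \mathbb{Q}$, where every fractional ideal is generated by a unique positive rational. The sketch below makes two simplifying assumptions of ours: each nonarchimedean component is given as a nonzero rational number standing in for an element of the completion, and an idele is stored as a dictionary listing only the primes where the component differs from 1. The names `ord_p`, `ideal_generator`, and `diagonal_idele` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction


def ord_p(y: Fraction, p: int) -> int:
    """Exponent of the prime p in the nonzero rational y."""
    if y == 0:
        raise ValueError("idele components must be nonzero")
    num, den, k = y.numerator, y.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return k


def ideal_generator(components: dict) -> Fraction:
    """Positive generator of the fractional ideal attached to an idele of Q.

    `components` maps a prime p to the p-component x_p; primes not listed
    have component 1 and contribute exponent 0.  The archimedean component
    plays no role in the ideal.
    """
    gen = Fraction(1)
    for p, x_p in components.items():
        gen *= Fraction(p) ** ord_p(x_p, p)   # P_p raised to ord_p(x_p)
    return gen


def diagonal_idele(x: Fraction, primes: list) -> dict:
    """Nonarchimedean components of the principal idele of x at the listed primes."""
    return {p: x for p in primes}


# The principal idele of x = -12/35 maps to the ideal generated by |x| = 12/35.
print(ideal_generator(diagonal_idele(Fraction(-12, 35), [2, 3, 5, 7])))
# Components that are units at p contribute nothing: ord_2(3) = 0, ord_5(10) = 1.
print(ideal_generator({2: Fraction(3), 5: Fraction(10)}))
```

Because the output depends only on the exponents $\mathrm{ord}_p(x_p)$, changing any component by a unit at that place leaves the ideal unchanged.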

Now suppose that the given idele $(x_v)_v$ is of the form $\iota(x)$ for some x in $K^\times$.
Then the procedure for mapping this idele to a product of powers of the nonzero
prime ideals of R is the same as the procedure for decomposing the fractional
ideal $xR$ as a product of powers of nonzero prime ideals of R. Consequently our
homomorphism descends to a homomorphism


\[ \mathbb{I}/\iota(K^\times) \to \mathcal{I}/\mathcal{P} \]


of the idele class group $\mathbb{I}/\iota(K^\times)$ onto the (finite) ideal class group $\mathcal{I}/\mathcal{P}$. This
is the fundamental fact about the ideles; the displayed homomorphism in effect
says that the idele class group refines the information in the ideal class group.
The subject of class field theory shows that this refined information is useful.
Under the homomorphism of $\mathbb{I}$ onto $\mathcal{I}$, the kernel consists exactly of $\mathbb{I}(S_\infty)$,
the ideles whose components at each nonarchimedean place v are in $R_v^\times$. Thus




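$\mathbb{I}/\mathbb{I}(S_\infty) \to \mathcal{I}$ is an isomorphism. Taking into account the effect on $\iota(K^\times)$, we
obtain an isomorphism


\[ \mathbb{I}/\big(\iota(K^\times)\,\mathbb{I}(S_\infty)\big) \cong \mathcal{I}/\mathcal{P}. \]


Returning to our hoped-for equality $\mathbb{I} = \iota(K^\times)\,\mathbb{I}(S_\infty)$ and comparing with the
displayed isomorphism, we see that $\mathbb{I}$ equals $\iota(K^\times)\,\mathbb{I}(S_\infty)$ if and only if $\mathcal{I} = \mathcal{P}$.
Equality $\mathcal{I} = \mathcal{P}$ holds if and only if every fractional ideal of K is principal, if and
only if every ordinary ideal of R is principal.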

Thus we see why a direct analog of the proof of Theorem 6.52 does not work
for Theorem 6.53. But at the same time we obtain information about how to give
a correct proof. We saw that factoring $\mathbb{I}/\iota(K^\times)$ by $\mathbb{I}(S_\infty)$ leads to the finite group
$\mathcal{I}/\mathcal{P}$. We shall see that if we factor $\mathbb{I}/\iota(K^\times)$ by a suitably larger group $\mathbb{I}(S)$ with
S still finite, then the quotient is the trivial group. An indication of this fact was
in Problems 19-23 at the end of Chapter V, which showed that if we localize R
at a large enough finite set of nonzero prime ideals, then the result is a principal
ideal domain. In adelic/idelic terms the corresponding procedure is to enlarge
$S_\infty$ to a suitable finite set S containing $S_\infty$ and to replace $\mathbb{I}(S_\infty)$ by $\mathbb{I}(S)$; this
enlargement has the effect of replacing $R_v^\times$ by $K_v^\times$ at finitely many places v in
considering what happens to ideals, and this is exactly what the localization in
those problems accomplishes. Thus for a suitable finite set S containing $S_\infty$, we
will have an isomorphism


\[ \mathbb{I}/\big(\iota(K^\times)\,\mathbb{I}(S)\big) = \{1\}; \]

in other words,

\[ \mathbb{I} = \iota(K^\times)\,\mathbb{I}(S) \]


for a suitable finite set S containing $S_\infty$.

One final remark is needed, and then we are ready to carry out the proof of
the compactness of $\mathbb{I}^1/\iota(K^\times)$. The remark is that we always have at least one
archimedean place, and adjusting an idele suitably at one archimedean place
can change it from being in $\mathbb{I}$ to being in the subgroup $\mathbb{I}^1$ of ideles for which
$\prod_v |x_v|_v = 1$. The members of $\iota(K^\times)$ are already in this subgroup, but the
members of $\mathbb{I}(S)$ need not be. Thus we replace $\mathbb{I}(S)$ by $\mathbb{I}(S) \cap \mathbb{I}^1 = \mathbb{I}^1(S)$, and
the above equality becomes


\[ \mathbb{I}^1 = \iota(K^\times)\,\mathbb{I}^1(S) \]

for a suitable finite set S.

PROOF OF COMPACTNESS OF $\mathbb{I}^1/\iota(K^\times)$ IN THEOREM 6.53. Let S be as above.
Since $\mathbb{I}^1 = \iota(K^\times)\,\mathbb{I}^1(S)$, the Second Isomorphism Theorem gives

\[ \mathbb{I}^1/\iota(K^\times) \cong \mathbb{I}^1(S)/\big(\iota(K^\times) \cap \mathbb{I}^1(S)\big). \tag{$*$} \]

We shall prove that the right side is compact.




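Let T be the complement of $S_\infty$ in S, and define


\[ \Omega^\times = \prod_{v \in S_\infty} K_v^\times, \qquad \Omega_T^\times = \prod_{v \in T} K_v^\times, \qquad \Delta_T^\times = \prod_{v \in T} R_v^\times, \qquad \Delta_S^\times = \prod_{v \notin S} R_v^\times. \]

If E is any subset of $\mathbb{I}(S)$, $E^1$ will denote the set of members of E of total
absolute value 1. Thus for example, $(\Omega^\times)^1$ is the set of tuples $(x_v)_{v \in S_\infty}$ with
$\prod_{v \in S_\infty} |x_v|_v = 1$.

Let $\Phi : K^\times \to \Omega^\times$ be the mapping given in Section 9. Each member u of the
group of units $R^\times$ has the property that $|u|_v = 1$ for every nonarchimedean place
v. Then it follows from the Artin product formula (Theorem 6.51) that $\Phi$ carries
$R^\times$ into $(\Omega^\times)^1$. One of the two key ingredients in the proof of Theorem 6.53 is
the observation that


\[ (\Omega^\times)^1/\Phi(R^\times) \ \text{is compact.} \tag{$**$} \]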


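In fact, $\Omega^\times$ is a product of $r_1$ copies of $\mathbb{R}^\times$ and $r_2$ copies of $\mathbb{C}^\times$. The function
$\mathrm{Log} : \Omega^\times \to \mathbb{R}^{r_1+r_2}$ given by


\[
\begin{aligned}
&\mathrm{Log}(x_1, \ldots, x_{r_1}, x_{r_1+1}, \ldots, x_{r_1+r_2}) \\
&\qquad = \big(\log |x_1|_{\mathbb{R}}, \ldots, \log |x_{r_1}|_{\mathbb{R}}, \log |x_{r_1+1}|_{\mathbb{C}}, \ldots, \log |x_{r_1+r_2}|_{\mathbb{C}}\big)
\end{aligned}
\]


is a continuous homomorphism of $\Omega^\times$ onto $\mathbb{R}^{r_1+r_2}$, and its kernel is compact,
being the product of $r_1$ two-element groups and $r_2$ circles. The image of $(\Omega^\times)^1$ is
a hyperplane, and the proof of the Dirichlet Unit Theorem (Theorem 5.13) shows
that $\mathrm{Log}\big((\Omega^\times)^1\big)/\mathrm{Log}\big(\Phi(R^\times)\big)$ is compact. Then (∗∗) follows.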
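
The hyperplane in question is the set of vectors whose coordinates sum to 0, and units of R land on it. The following sketch checks this numerically for the real quadratic field $\mathbb{Q}(\sqrt{2})$, an example of our choosing with $r_1 = 2$ and $r_2 = 0$; the function names `log_embedding` and `norm` are hypothetical, and floating-point arithmetic means the sums are only approximately 0.

```python
import math


def log_embedding(a: int, b: int) -> tuple:
    """Log of the two real embeddings of a + b*sqrt(2), assumed nonzero.

    The embeddings send sqrt(2) to +sqrt(2) and to -sqrt(2); both places are
    real, so the normalized absolute value is the ordinary one.
    """
    s = math.sqrt(2.0)
    return (math.log(abs(a + b * s)), math.log(abs(a - b * s)))


def norm(a: int, b: int) -> int:
    """Norm of a + b*sqrt(2) down to Q, namely a**2 - 2*b**2."""
    return a * a - 2 * b * b


u = log_embedding(1, 1)    # the unit 1 + sqrt(2), of norm -1
u3 = log_embedding(7, 5)   # its cube 7 + 5*sqrt(2), also of norm -1
print(norm(1, 1), norm(7, 5))
print(sum(u), sum(u3))     # both sums are 0 up to rounding error
print(u3[0] / u[0])        # close to 3: Log turns powers into multiples
```

A non-unit such as $3 + \sqrt{2}$, of norm 7, has coordinate sum $\log 7$ and so lies off the hyperplane.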

The other key ingredient is the finiteness of the class number of $K$, which was
proved as Theorem 5.19. Let $h$ be this class number. For each $v$ in $T = (S_\infty)^c$, let
$P_v$ be the corresponding nonzero prime ideal in $R$. The ideal $P_v^h$ in $R$ is principal,
and we let $\pi_v$ be a generator. This element has the properties that $K_v^\times/\iota_v(\pi_v)^{\mathbb{Z}} R_v^\times$
is compact and that $|\iota_{v'}(\pi_v)|_{v'} = |\pi_v|_{v'} = 1$ for all nonarchimedean $v'$ with
$v' \neq v$. Let

    $\Sigma_2 = \prod_{v \in T} \iota_v(\pi_v)^{\mathbb{Z}} R_v^\times$;

this is a subgroup between $\Delta_2^\times$ and $\Omega_2^\times$ such that $\Omega_2^\times/\Sigma_2$ is compact. Let $\Pi$ be the
subgroup of $K^\times$ given by $\Pi = \prod_{v \in T} \pi_v^{\mathbb{Z}}$.

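The group $\iota(\Pi)$ is certainly a subgroup of $\iota(K^\times)$, and the fact that $|\pi_v|_{v'} = 1$
for $v' \notin S$ implies that $\iota(\Pi)$ is contained in $\mathbb{I}^1(S)$. Each member of $\iota(R^\times)$ has
all nonarchimedean absolute values equal to 1, and consequently we have an
inclusion $\iota(R^\times)\iota(\Pi) \subseteq \iota(K^\times) \cap \mathbb{I}^1(S)$. In view of (*), $\mathbb{I}^1(S)/(\iota(K^\times) \cap \mathbb{I}^1(S))$ is a
homomorphic image of

    $\mathbb{I}^1(S)/(\iota(R^\times)\,\iota(\Pi)\,(\{1\} \times \Delta_2^\times \times \Delta_3^\times))$,    (†)

and it is therefore enough to prove that (†) is compact.
The members of $\iota(R^\times)$ have all nonarchimedean absolute values equal to 1,
and consequently

    $\iota(R^\times)(\{1\} \times \Delta_2^\times \times \Delta_3^\times) = \Phi(R^\times) \times \Delta_2^\times \times \Delta_3^\times$.

Therefore the quotient of (†) by

    $\mathbb{I}^1(S)/(\iota(\Pi)((\Omega_1^\times)^1 \times \Delta_2^\times \times \Delta_3^\times))$    (††)

is isomorphic to

    $\big(\mathbb{I}^1(S)/(\iota(\Pi)(\Phi(R^\times) \times \Delta_2^\times \times \Delta_3^\times))\big) \big/ \big(\mathbb{I}^1(S)/(\iota(\Pi)((\Omega_1^\times)^1 \times \Delta_2^\times \times \Delta_3^\times))\big)$,

which in turn is isomorphic to

    $(\iota(\Pi)((\Omega_1^\times)^1 \times \Delta_2^\times \times \Delta_3^\times))/(\iota(\Pi)(\Phi(R^\times) \times \Delta_2^\times \times \Delta_3^\times))$,

which is a homomorphic image of

    $((\Omega_1^\times)^1 \times \Delta_2^\times \times \Delta_3^\times)/(\Phi(R^\times) \times \Delta_2^\times \times \Delta_3^\times) \cong (\Omega_1^\times)^1/\Phi(R^\times)$.

The right side is compact by (**), and therefore it is enough to prove that (††) is
compact.
Let us check that

    $\iota(\Pi)((\Omega_1^\times)^1 \times \Delta_2^\times \times \Delta_3^\times) = (\Omega_1^\times \times \Sigma_2 \times \Delta_3^\times)^1$.    (‡)

The inclusion $\subseteq$ is immediate. Thus suppose that $((\omega_v)_{v \in S_\infty}, (\sigma_v)_{v \in T}, (\delta_v)_{v \notin S})$
lies in the right side of (‡). Since $(\sigma_v)_{v \in T}$ lies in $\Sigma_2$, there exists an element
$\pi_0$ in $\Pi$ such that $r_v = \iota_v(\pi_0)^{-1}\sigma_v$ lies in $R_v^\times$ for all $v \in T$. Define
$(\omega'_v)_{v \in S_\infty}$ in $\Omega_1^\times$ by $\omega'_v = \iota_v(\pi_0)^{-1}\omega_v$. For a suitable $(\delta'_v)_{v \notin S}$, we then
have $\iota(\pi_0)((\omega'_v)_{v \in S_\infty}, (r_v)_{v \in T}, (\delta'_v)_{v \notin S}) = ((\omega_v)_{v \in S_\infty}, (\sigma_v)_{v \in T}, (\delta_v)_{v \notin S})$, and (‡)
is proved.

Combining (††) and (‡), we see that it is enough to prove that

    $\mathbb{I}^1(S)/(\Omega_1^\times \times \Sigma_2 \times \Delta_3^\times)^1$    (‡‡)

is compact. The inclusion of $\mathbb{I}^1(S)$ into $\mathbb{I}(S)$ induces a homomorphism

    $\mathbb{I}^1(S)/(\Omega_1^\times \times \Sigma_2 \times \Delta_3^\times)^1 \to \mathbb{I}(S)/(\Omega_1^\times \times \Sigma_2 \times \Delta_3^\times)$    (§)

that is evidently one-one. But it is also onto because if $v_0$ is an archimedean
place and if $(x_v)_v$ is given in $\mathbb{I}(S)$, then we can adjust $x_{v_0}$ in such a way that
the replacement $(x_v)_v$ has absolute value 1. The adjustment is by a member of
$\Omega_1^\times \times \{1\} \times \{1\}$, and thus (§) is onto. The right side of (§) is

    $(\Omega_1^\times \times \Omega_2^\times \times \Delta_3^\times)/(\Omega_1^\times \times \Sigma_2 \times \Delta_3^\times) \cong \Omega_2^\times/\Sigma_2$,

and we have arranged that this is compact. Consequently (‡‡) is compact, and
the proof is complete.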




11. Problems 


1. If $F$ is a complete field with a nonarchimedean absolute value and if $\sum_{n=1}^\infty a_n$ is
an infinite series whose terms $a_n$ are in $F$, prove that the series converges in $F$ if
and only if $\lim_n a_n = 0$.
2. Let the 2-adic absolute value be imposed on $\mathbb{Q}$. Theorem 6.5 shows that $\mathbb{Z}$ is
dense in the subring of $\mathbb{Q}$ consisting of all rationals with odd denominator.
(a) Find a sequence of integers converging in this metric to $\tfrac{1}{3}$.
(b) Generalize the result of (a) by finding an explicit sequence of integers
converging in this metric to any given rational $ab^{-1}$, where $a$ and $b$ are
nonzero integers with $b$ odd.


3. For the Dedekind domain $R = \mathbb{Z}$ and its field of fractions $K = \mathbb{Q}$, the group of
units $R^\times$ is just $\{\pm 1\}$, and the set of archimedean places is just $S_\infty = \{\infty\}$. The
formula $\iota(R^\times) = \iota(K^\times) \cap \mathbb{I}(S_\infty)$ of Section 10 therefore becomes
$\{\iota(\pm 1)\} = \iota(\mathbb{Q}^\times) \cap (\mathbb{R}^\times \times \prod_p \mathbb{Z}_p^\times)$.

(a) Verify this formula directly.

(b) Since $\mathbb{Z}$ is a principal ideal domain, the theory of Section 10 and the above
remarks show that $\mathbb{I} = \iota(\mathbb{Q}^\times)(\mathbb{R}^\times \times \prod_p \mathbb{Z}_p^\times)$. Prove this formula by an
explicit construction whose only allowable choice, in view of (a), is a certain
sign.

4. Let $R$ be the Dedekind domain $\mathbb{Z}[\sqrt{-5}]$.

(a) Verify for each choice of sign that the ideals $(1 \pm \sqrt{-5}, 3)$ and $(1 \pm \sqrt{-5}, 2)$
are prime and that $(1 + \sqrt{-5}, 2) = (1 - \sqrt{-5}, 2)$.

(b) Find the prime factorizations of the principal ideals $(1 + \sqrt{-5})$ and $(3)$.

(c) Let $P$ be the prime ideal $P = (1 + \sqrt{-5}, 3)$, and let $v_P$ be the valuation of
$R$ determined by $P$. Prove that $v_P((1 + \sqrt{-5})/3) = 0$.

(d) Lemma 6.3 shows that $(1 + \sqrt{-5})/3$ can be written as the quotient of two
members $a$ and $b$ of $R$ with $v_P(a) = v_P(b) = 0$. Find such a choice of $a$
and $b$.


5. Let $v$ be a discrete valuation of a field $F$, let $R_v$ be the valuation ring, and let
$P_v$ be the valuation ideal. It was observed after Proposition 6.2 that $1 + P_v^n$ is a
group under multiplication for any $n \geq 1$. Prove for $n \geq 1$ that the multiplicative
group $(1 + P_v^n)/(1 + P_v^{n+1})$ is isomorphic to the additive group $P_v^n/P_v^{n+1}$ under
the mapping induced by $1 + x \mapsto x + P_v^{n+1}$.


6. Derive the finiteness of the class number of a number field $K$ from the
compactness of $\mathbb{I}^1/\iota(K^\times)$ given as Theorem 6.53.


Problems 7-8 compare the topology on the ideles $\mathbb{I} = \mathbb{I}_K$ of a number field $K$ with
the topology of the adeles $\mathbb{A} = \mathbb{A}_K$. The notation is as in Section 10.




7. For each finite set $S$ of places containing the archimedean places, exhibit the
mappings $\mathbb{I}(S) \to K_v$ for $v \in S$ and $\mathbb{I}(S) \to R_v$ for $v \notin S$ as continuous, and
deduce that the inclusion $\mathbb{I} \to \mathbb{A}$ is continuous.


8. Let $p_n$ be the $n$th positive prime in $\mathbb{Z}$, and let $x_n = (x_{n,v})_v$ be the adele in $\mathbb{A}_{\mathbb{Q}}$
with $x_{n,v} = p_n$ if $v = p_n$ and $x_{n,v} = 1$ if $v \neq p_n$. The result is a sequence
$\{x_n\}$ of ideles in $\mathbb{I}_{\mathbb{Q}}$. Show that this sequence converges to the idele $(1)_v$ in the
topology of the adeles but does not converge in the topology of the ideles.


Problems 9-10 below assume knowledge from measure theory of elementary prop- 
erties of measures and of the existence—uniqueness theorem for translation-invariant 
measures (Haar measures) on locally compact abelian groups. The continuity in 
Problem 10a requires making estimates of integrals. 


9. Let $G$ be a locally compact abelian topological group with a Haar measure
written as $dx$, and let $\Phi$ be an automorphism of $G$ as a topological group, i.e., an
automorphism of the group structure that is also a homeomorphism of $G$. Prove
that there is a positive constant $a(\Phi)$ such that $d(\Phi(x)) = a(\Phi)\,dx$.


10. Let $F$ be a locally compact topological field, and let $F^\times$ be the group of nonzero
elements, the group operation being multiplication.

(a) Let $c$ be in $F^\times$, and define $|c|_F$ to be the constant $a(\Phi)$ from the previous
problem when the measure is an additive Haar measure and $\Phi$ is
multiplication by $c$. Define $|0|_F = 0$. Prove that $c \mapsto |c|_F$ is a continuous function
from $F$ into $[0, +\infty)$ such that $|c_1 c_2|_F = |c_1|_F |c_2|_F$.

(b) If $dx$ is a Haar measure for $F$ as an additive locally compact group, prove
that $dx/|x|_F$ is a Haar measure for $F^\times$ as a multiplicative locally compact
group.

(c) Let $F = \mathbb{R}$ be the locally compact field of real numbers. Compute the
function $x \mapsto |x|_F$. Do the same thing for the locally compact field $F = \mathbb{C}$
of complex numbers.

(d) Let $F = \mathbb{Q}_p$ be the locally compact field of $p$-adic numbers, where $p$ is a
prime. Compute the function $x \mapsto |x|_F$.

(e) For the field $F = \mathbb{Q}_p$ of $p$-adic numbers, suppose that the ring $\mathbb{Z}_p$ of $p$-adic
integers has additive Haar measure 1. What is the additive Haar measure of
the maximal ideal $I$ of $\mathbb{Z}_p$?


Problems 11-14 analyze the structure of complete valued fields whose residue class
fields are finite, showing that the only kinds are $p$-adic fields and fields of formal
Laurent series over a finite field. Let $F$ be a complete valued field with a discrete
nonarchimedean valuation, let $v$ be the valuation, let $R$ be the valuation ring, and let
$\mathfrak{p}$ be the maximal ideal of $R$. Suppose that the residue class field $R/\mathfrak{p}$ is finite of order
$q = p^m$ for a prime number $p$. Theorem 6.26 shows that the topology on $F$ is locally
compact. The normalized absolute value on $F$ corresponding to $v$ is $|\cdot|_F = q^{-v(\cdot)}$.
For some purposes it is convenient to separate the equal-characteristic case for $F$
and $R/\mathfrak{p}$ from the unequal-characteristic case.




11. Show in the unequal-characteristic case that F has characteristic 0. 


12. (a) In both cases, use Hensel's Lemma to show that $F$ has a full set of $(q - 1)$st
roots of unity and that coset representatives in $F$ for $R/\mathfrak{p}$ can be taken to
be these elements and 0. Denote this subset of $q$ elements of $F$ by $E$. The
subset $E$ is of course closed under multiplication.

(b) Show in the equal-characteristic case that $E$ is closed under addition and
subtraction and is therefore a subfield of $F$ isomorphic to $\mathbb{F}_q$.


13. In the equal-characteristic case, write $\mathbb{F}_q$ for the subfield of $F$ constructed in
Problem 12b, and let $t$ be a generator of the principal ideal $\mathfrak{p}$, so that $v(t) = 1$.

(a) Show that each nonzero element of $R$ has a convergent infinite-series
expansion of the form $\sum_{k=0}^\infty a_k t^k$ with all $a_k$ in $\mathbb{F}_q$ and that the value of $v$ on
such an element is the smallest $k \geq 0$ such that $a_k \neq 0$.

(b) Show conversely that every series $\sum_{k=0}^\infty a_k t^k$ with all $a_k$ in $\mathbb{F}_q$ lies in $R$, and
conclude that $R \cong \mathbb{F}_q[[t]]$.

(c) Deduce that $F$ is isomorphic to the field $\mathbb{F}_q((t))$ of formal Laurent series
over $\mathbb{F}_q$, the understanding being that each such series involves only finitely
many negative powers of $t$.


14. Let $F$ be an arbitrary complete valued field in the unequal-characteristic case.
Since Problem 11 shows $F$ to be of characteristic 0, $F$ contains a subfield $\mathbb{Q}'$
isomorphic as a field to $\mathbb{Q}$.

(a) Show that the integer $q = p^m$ in $\mathbb{Q}'$ lies in $\mathfrak{p}$.

(b) Deduce that the number $v_0 = v(p)$ is positive.

(c) For each nonzero member $ab^{-1}p^k$ of $\mathbb{Q}'$ for which $a$ and $b$ are integers
relatively prime to $p$, show that $v(ab^{-1}p^k) = kv_0$.

(d) Deduce that $(\mathbb{Q}', |\cdot|_F^{1/(mv_0)})$ is isomorphic as a valued field to $(\mathbb{Q}, |\cdot|_p)$.

(e) Let $\overline{\mathbb{Q}'}$ be the closure of $\mathbb{Q}'$ in $F$, and explain why $(\overline{\mathbb{Q}'}, |\cdot|_F^{1/(mv_0)})$ is isomorphic
as a valued field to $(\mathbb{Q}_p, |\cdot|_p)$.

(f) Let $t$ be a generator of $\mathfrak{p}$. With $E$ as in Problem 12a, show that each member
of $F$ has a unique series expansion $\sum_{k=-N}^\infty a_k t^k$ with each $a_k$ in $E$ and with
$N$ depending on the element, and show furthermore that every such series
expansion converges to an element of $F$.

(g) Let $c_1, \dots, c_l$ with $l = q^{v_0}$ be an enumeration of the elements $\sum_{k=0}^{v_0-1} a_k t^k$
with all $a_k$ in $E$. Show that to each element $x$ in $R$ corresponds some $c_j$
such that $p^{-1}(x - c_j)$ lies in $R$. Deduce that every element of $R$ is the sum
of a convergent series of the form $\sum_{k=0}^\infty c_{j_k} p^k$.

(h) Explain how it follows from the previous part that $F$ is a finite-dimensional
vector space over $\overline{\mathbb{Q}'}$, hence that $F$ is a finite extension of the field $\mathbb{Q}_p$.


Problems 15-19 continue the analysis in Problems 11-14 by examining finite
separable extensions of complete valued fields whose residue class fields are finite. The
goal is to prove Proposition 6.38 and Lemmas 6.47 and 6.48. Let $F$ be a complete
valued field with a discrete nonarchimedean valuation, let $R$ be the valuation ring, and
let $\mathfrak{p}$ be the maximal ideal of $R$. Suppose that the residue class field $R/\mathfrak{p}$ is finite of
order $q = p^m$ for a prime number $p$. Let $K$ be a finite separable extension of $F$, put
$n = [K : F]$, and let $T$ be the integral closure of $R$ in $K$. Theorem 6.33 shows that
$K$ is a valued field, that it has a unique nonzero prime ideal $P$, that the valuation ring
of $K$ is $T$, and that the valuation ideal is $P$. Write $f$ for the dimension of $T/P$ over
$R/\mathfrak{p}$, so that $T/P$ has order $q^f$. Also, write $e$ for the power such that $\mathfrak{p}T = P^e$. It
is known from Chapter IX of Basic Algebra that $n = ef$. In the equal-characteristic
case, there is an especially transparent argument for proving Proposition 6.38, and
Problem 15 gives that. Problem 16 gives a less transparent argument that handles
both cases at once. The remaining problems address Lemmas 6.47 and 6.48.


15. In the equal-characteristic case, let $E$ be the subset of $q$ elements of $F$ described
in Problem 12, and let $E'$ be the corresponding subset of $q^f$ elements of $K$.
Problem 13 shows that $E$ is a field isomorphic to $\mathbb{F}_q$ and that $E'$ is an extension
field isomorphic to $\mathbb{F}_{q^f}$. Let $t$ be a generator in $R$ of $\mathfrak{p}$, and let $\tau$ be a generator
in $T$ of $P$. Problem 13 shows that $F \cong \mathbb{F}_q((t))$ and that $K \cong \mathbb{F}_{q^f}((\tau))$.

(a) Show that the set $L$ of formal Laurent series in $t$ with coefficients from $\mathbb{F}_{q^f}$
is an intermediate field between $F$ and $K$, so that $L \cong \mathbb{F}_{q^f}((t))$.

(b) Why does it follow that the integral closure of $R$ in $L$ is $U = \mathbb{F}_{q^f}[[t]]$ and
that the maximal ideal of $U$ is $\wp = tU$?

(c) Deduce that the residue class field of $L$ is $\mathbb{F}_{q^f}$ of order $q^f$ and that $\wp T = P^e$,
so that the residue class degree of $L/F$ is $f$ and the ramification index of
$K/L$ is $e$.

(d) How can one conclude that $L/F$ is unramified and that $K/L$ is totally
ramified?


16. In this problem no distinction is made between the equal-characteristic case and
the unequal-characteristic case. Let $k_F$ and $k_K$ be the residue class fields of $F$ and
$K$, and write $k_K = k_F(\bar\alpha)$, where $\bar\alpha$ is a root of a monic irreducible polynomial
$\bar g(X)$ in $k_F[X]$. Let $g(X)$ be a monic polynomial in $R[X]$ that reduces modulo
$\mathfrak{p}$ to $\bar g(X)$.

(a) Prove that there exists $\alpha \in T$ with $\alpha + P = \bar\alpha$ and with $g(\alpha) = 0$.

(b) With $\alpha$ as in (a), let $L$ be the intermediate field between $F$ and $K$ given by
$L = F(\alpha)$, let $U$ be the integral closure of $R$ in $L$, let $\wp$ be the maximal
ideal of $U$, and let $k_L = U/\wp$. Show that $\alpha$ lies in $U$ and that the member
$\bar\alpha$ of $k_K$ is in the image of the natural field map $k_L \to k_K$.

(c) Conclude from (b) that $k_L = k_K$.

(d) By comparing $[L : F]$, the degrees of $g(X)$ and $\bar g(X)$, and the indices $e$ and
$f$ for $K/F$ and $L/F$, prove that $L$ has the properties required by Proposition
6.38.




17. This problem applies to both the equal-characteristic case and the unequal-
characteristic case. Let $\xi$ be a member of $T$ such that $K = F(\xi)$, and let
$g(X) = X^n + c_1 X^{n-1} + \cdots + c_n$ be its minimal polynomial over $F$.

(a) Let $N = \sum_{k=0}^{n-1} R\xi^k$. This is a free $R$ submodule of $T$ of rank $n$ with
$\{1, \xi, \dots, \xi^{n-1}\}$ as an $R$ basis. Define

    $\widehat{N} = \{y \in K \mid \mathrm{Tr}_{K/F}(xy) \text{ is in } R \text{ for all } x \in N\}$.

Put $x_i = \xi^{i-1}$ for $1 \leq i \leq n$. Why is there a unique $y_j$ in $K$ with
$\mathrm{Tr}_{K/F}(x_i y_j) = \delta_{ij}$? Show that $\widehat{N}$ is a free $R$ module with $\{y_1, \dots, y_n\}$
as $R$ basis.

(b) If $A$ is a matrix in $M_n(R)$ with $\det A = \pm 1$ and if $z_i = \sum_{j=1}^n A_{ij} y_j$, why is
$\sum_{i=1}^n R z_i = \sum_{j=1}^n R y_j$?

(c) Let $K'$ be a splitting field of $g(X)$ over $F$, and let $\xi_1, \dots, \xi_n$ be the roots of
$g(X)$ in $K'$, with $\xi_1 = \xi$. It is known from Basic Algebra that $\xi_1, \dots, \xi_n$ are
distinct. Prove that

    $\sum_{j=1}^n \dfrac{g(X)}{g'(\xi_j)(X - \xi_j)} = 1$

by observing that the difference of the two sides is a polynomial in $X$ of
degree at most $n - 1$ and all of $\xi_1, \dots, \xi_n$ are roots.

(d) Let $\sigma_j$ be the field map that fixes $F$ and carries $F(\xi)$ into $K'$ in such a way that
$\sigma_j(\xi) = \xi_j$. These mappings have the property that $\mathrm{Tr}_{K/F}(\eta) = \sum_{j=1}^n \sigma_j(\eta)$
for all $\eta \in K$. If $h(X)$ is in the ring $K[[X]]$ of formal power series over $K$,
let $h^{\sigma_j}(X)$ be the series obtained by applying $\sigma_j$ to each coefficient,
and extend $\mathrm{Tr}_{K/F} : K \to F$ to a mapping of $K[[X]]$ to $F[[X]]$ by letting
$\mathrm{Tr}_{K/F}\,h(X) = \sum_{j=1}^n h^{\sigma_j}(X)$. By making the substitution $X \mapsto 1/X$ in (c)
and using the extended trace function just defined, show that

    $\dfrac{X^n}{1 + c_1 X + \cdots + c_n X^n} = \mathrm{Tr}_{K/F}\!\left(\dfrac{X}{g'(\xi)(1 - \xi X)}\right)$.

(e) Write the identity in (d) out with power series, equate the coefficients of
$X, X^2, \dots, X^n$ on the two sides, and deduce that $\mathrm{Tr}_{K/F}(\xi^{k-1} g'(\xi)^{-1})$
equals 0 for $1 \leq k < n$ and equals 1 for $k = n$.

(f) Form the $n$-by-$n$ matrix $A$ with $A_{ij} = \mathrm{Tr}_{K/F}((\xi^{i-1} g'(\xi)^{-1})(\xi^{j-1}))$. The
result of (e) shows that this matrix has all entries equal to 0 that lie above
the off-diagonal $i + j = n + 1$ and all entries equal to 1 that lie on the
off-diagonal. By writing $\xi^{i+j-2} = \xi^n \xi^{i+j-(n+1)-1}$ and by substituting for
$\xi^n$, show that the remaining entries $A_{ij}$ lie in $R$.

(g) Combine the conclusions of (a), (b), and (f) to prove that $\widehat{N} = g'(\xi)^{-1} N$.


18. This problem continues with the notation of Problem 17 and assumes in addition
that $K/F$ is unramified, i.e., that $f = n$ and $e = 1$. The objective is to prove the
assertion of Lemma 6.48 that $\mathcal{D}(K/F) = T$.

(a) Prove that the intermediate field $L$ constructed in Problem 16 is $K$ itself,
that the polynomial $g(X)$ is the minimal polynomial of $\alpha$ over $F$, and that
$K = F(\alpha)$.

(b) Let $N = \sum_{k=0}^{n-1} R\alpha^k$. Apply Problem 17 to obtain $\widehat{N} = g'(\alpha)^{-1} N$. Using
the inclusion $N \subseteq T$, deduce that $\widehat{N} \supseteq \widehat{T}$ and conclude that
$\mathcal{D}(K/F)^{-1} \subseteq g'(\alpha)^{-1} T$.

(c) Prove that $g'(\alpha)$ is a unit in $T$, and deduce that $\mathcal{D}(K/F) = T$.

19. This problem continues with the notation of Problem 17 and assumes in addition
that $K/F$ is totally ramified, i.e., that $e = n$ and $f = 1$. The objective is to prove
the assertion of Lemma 6.47 that $\mathcal{D}(K/F) = P^{e'}$ with $e'$ equal to $e - 1$ if $p$ does
not divide $e$ and with $e' \geq e$ if $p$ divides $e$. Let $E$ be the set of representatives in
$R$ of the members of $R/\mathfrak{p}$ as constructed in Problem 12. Since $f = 1$, the set $E$
is also a set of representatives in $T$ of the members of $T/P$. Let $v_K$ and $v_F$ be the
respective discrete valuations of $K$ and $F$, so that $v_F = n^{-1} v_K|_F$ by Proposition
6.34. Let $\pi$ and $\lambda$ be respective generators of $P$ and $\mathfrak{p}$.

(a) Prove that if $M$ is a field with a discrete valuation $w$ and if $x_1, \dots, x_m$ are
elements of $M$ with $x_1 + \cdots + x_m = 0$ and $m \geq 2$, then the number of $j$'s
for which $w(x_j) = \min_{1 \leq i \leq m} w(x_i)$ is at least 2.

(b) Let $g(X) = c_0 X^n + c_1 X^{n-1} + \cdots + c_n$ with $c_0 = 1$ be the field polynomial
of $\pi$ over $F$. Why are all the coefficients $c_j$ in $R$, and why is $v_K(c_j)$ divisible
by $n$ for each $j$?

(c) Taking into account that $\pi$ is a root of its field polynomial and applying
(a), show that there exist integers $i$ and $j$ with $0 \leq i < j \leq n$ such that
$j - i = v_K(c_j) - v_K(c_i)$ and that all other integers $k$ with $0 \leq k \leq n$ have
$v_K(c_k \pi^{n-k}) \geq v_K(c_i \pi^{n-i})$.

(d) Using the divisibility conclusion of (b), show that $g(X)$ is an Eisenstein
polynomial relative to $\mathfrak{p}$ in the sense that $c_0 = 1$, that all of $c_1, \dots, c_n$ lie in
$\mathfrak{p}$, and that $c_n$ does not lie in $\mathfrak{p}^2$.

(e) Conclude from (d) that $g(X)$ is irreducible over $F$, that $g(X)$ is the minimal
polynomial of $\pi$ over $F$, and that $K = F(\pi)$.

(f) For each $k \geq 0$, apply the division algorithm to write $k = ni + j$ with
$0 \leq j < n = e$, and define $y_k = \lambda^i \pi^j$. Show that every member of $T$ has
a unique convergent series expansion as $\sum_{k=0}^\infty a_k y_k$ and that all such series
expansions have sum in $T$.

(g) By rewriting the expansion in (f) suitably, show that $\{1, \pi, \dots, \pi^{n-1}\}$ is an
$R$ basis for the free $R$ module $T$.

(h) By applying Problem 17 with $N = \sum_{k=0}^{n-1} R\pi^k$, prove that $\widehat{T} = g'(\pi)^{-1} T$,
and deduce that $\mathcal{D}(K/F) = (g'(\pi))$.

(i) Computing $g'(\pi)$ and applying the valuation $v_K$ to it, show that $v_K(g'(\pi)) =
e - 1$ if $v_K(e) = 0$ and that $v_K(g'(\pi)) \geq e$ if $v_K(e) > 0$. Explain how this
conclusion proves Lemma 6.47.


CHAPTER VII 


Infinite Field Extensions 


Abstract. This chapter provides algebraic background for directly addressing some simple-sounding 
yet fundamental questions in algebraic geometry. All the questions relate to the set of simultaneous 
zeros of finitely many polynomials in n variables over a field. 

Section 1 concerns existence of zeros. The main theorem is the Nullstellensatz, which in part 
says that there is always a zero if the finitely many polynomials generate a proper ideal and if the 
underlying field is algebraically closed. 

Section 2 introduces the transcendence degree of a field extension. If L/K is a field extension, 
a subset of L is algebraically independent over K if no nonzero polynomial in finitely many of 
the members of the subset vanishes. A transcendence basis is a maximal subset of algebraically 
independent elements; a transcendence basis exists, and its cardinality is independent of the particular 
basis in question. This cardinality is the transcendence degree of the extension. Then L is algebraic 
over the subfield generated by a transcendence basis. Briefly any field extension can be obtained by 
a purely transcendental extension followed by an algebraic extension. The dimension of the set of 
common zeros of a prime ideal of polynomials over an algebraically closed field is defined to be the 
transcendence degree of the field of fractions of the quotient of the polynomial ring by the ideal. 

Section 3 elaborates on the notion of separability of field extensions in characteristic $p$. Every
algebraic extension $L/K$ can be obtained by a separable extension followed by an extension that is
purely inseparable in the sense that every element $x$ of $L$ has a power $x^{p^e}$ for some integer $e \geq 0$
with $x^{p^e}$ separable over $K$.

Section 4 introduces the Krull dimension of a commutative ring with identity. This number is
one less than the maximum number of ideals occurring in a strictly increasing chain of prime ideals
in the ring. For $K[X_1, \dots, X_n]$ when $K$ is a field, the Krull dimension is $n$. If $P$ is a prime ideal in
$K[X_1, \dots, X_n]$, then the Krull dimension of the integral domain $R = K[X_1, \dots, X_n]/P$ matches
the transcendence degree over $K$ of the field of fractions of $R$. Thus Krull dimension extends the
notion of dimension that was defined in Section 2.

Section 5 concerns nonsingular and singular points of the set of common zeros of a prime ideal 
of polynomials in n variables over an algebraically closed field. According to Zariski’s Theorem, 
nonsingularity of a point may be defined in either of two equivalent ways: in terms of the rank of a
Jacobian matrix obtained from generators of the ideal, or in terms of the dimension of the quotient of 
the maximal ideal at the point in question factored by the square of this ideal. The point is nonsingular 
if the rank of the Jacobian matrix is n minus the dimension of the zero locus, or equivalently if the 
dimension of the quotient of the maximal ideal by its square equals the dimension of the zero locus. 
Nonsingular points always exist. 

Section 6 extends Galois theory to certain infinite field extensions. In the algebraic case inverse 
limit topologies are imposed on Galois groups, and the generalization of the Fundamental Theorem 
of Galois Theory to an arbitrary separable normal extension L/K gives a one-one correspondence 
between the fields $F$ with $K \subseteq F \subseteq L$ and the closed subgroups of $\mathrm{Gal}(L/K)$.


1. Nullstellensatz 


Algebraic geometry studies the geometric properties of sets defined by algebraic 
equations. In the simplest case some field K is specified, the equations are 
polynomial equations in several variables with coefficients in K, and one seeks 
solutions to the system of equations with the variables taking values in K or some 
larger field. 

The nature of the subject is that even fairly simple-sounding geometric ques- 
tions require algebraic background beyond what is in Basic Algebra and the 
first six chapters of the present book. This chapter addresses the necessary 
background, largely from the theory of fields, for addressing fundamental ques- 
tions concerning existence of solutions, the dimension of the space of solutions, 
singularity of the solution set at a particular point, and effects of changing fields. 

The present section supplies background for the question of existence. We
have a system of polynomial equations in $n$ variables with coefficients in $K$, and
we are interested in simultaneous solutions in a given extension field $L$ of $K$. A
solution can be regarded as a column vector in $L^n$. Think of the equations as of the
form $F_i(X_1, \dots, X_n) = 0$ with each $F_i$ a polynomial, and then the set of solutions
is the locus of common zeros of the $F_i$'s in $L^n$. The locus of common zeros is
unaffected by enlarging the system of equations by allowing all equations of the
form $\sum_i G_i F_i = 0$ with each $G_i$ arbitrary in $K[X_1, \dots, X_n]$; thus we may as
well regard the left sides as all members of some ideal $I$ in $K[X_1, \dots, X_n]$. The
Hilbert Basis Theorem says that any ideal in $K[X_1, \dots, X_n]$ is finitely generated,
and hence studying the common zero locus for an ideal is always the same as
studying the common zero locus for a finite set of polynomials.

A proper ideal need not have a nonempty locus of common zeros. For example,
if $K = \mathbb{R}$, then the single equation $X^2 + Y^2 + 1 = 0$ has no solutions in $\mathbb{R}^2$.
Hilbert's Nullstellensatz$^1$ is partly the affirmative statement that any proper ideal
has a nonempty locus of common zeros under the additional assumption that $K$ is
algebraically closed.


$^1$German for "zero-locus theorem."


Theorem 7.1 (Nullstellensatz). Let $K$ be a field, let $\overline{K}$ be an algebraic closure,
and let $n$ be a positive integer. Then every maximal ideal $J$ of $K[X_1, \dots, X_n]$
has the property that $K[X_1, \dots, X_n]/J$ is a finite algebraic extension of $K$, and
in particular the maximal ideals of $\overline{K}[X_1, \dots, X_n]$ are of the form

    $(X_1 - a_1, \dots, X_n - a_n)$,

where $(a_1, \dots, a_n)$ is an arbitrary member of $\overline{K}^n$. Consequently if $I$ is any proper
ideal in $K[X_1, \dots, X_n]$, then

(a) the locus of common zeros of $I$ in $\overline{K}^n$ is nonempty,

(b) any $f$ in $K[X_1, \dots, X_n]$ that vanishes on the locus of common zeros of
$I$ in $\overline{K}^n$ has the property that $f^k$ is in $I$ for some integer $k > 0$.
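
To make the statement concrete, here is a minimal worked instance with $n = 1$ and $K = \mathbb{R}$. The choice of the polynomial $X^2 + 1$ is an illustration added here and is not taken from the text.

```latex
\begin{align*}
&K = \mathbb{R}: && J = (X^2 + 1) \text{ is maximal in } \mathbb{R}[X],
   \quad \mathbb{R}[X]/J \cong \mathbb{C}, \quad [\mathbb{C} : \mathbb{R}] = 2. \\
&\overline{K} = \mathbb{C}: && X^2 + 1 = (X - i)(X + i),
   \quad (X^2 + 1) \subseteq (X - i), \quad (X^2 + 1) \subseteq (X + i). \\
&I = (X^2 + 1): && \text{the zero locus is empty in } \mathbb{R}
   \text{ but equals } \{i, -i\} \text{ in } \mathbb{C}.
\end{align*}
```

The first line illustrates the finite-extension conclusion, the second the form $(X_1 - a_1)$ of maximal ideals over $\overline{K}$, and the third conclusion (a).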


Before coming to the proof, we mention an important corollary. 


Corollary 7.2. Let $K$ be a field, let $\overline{K}$ be an algebraic closure, let $n$ be a
positive integer, and let $I$ be a prime ideal in $K[X_1, \dots, X_n]$. Then $I$ contains
every polynomial in $K[X_1, \dots, X_n]$ that vanishes on the locus of common zeros
of $I$ in $\overline{K}^n$.


PROOF. If $f$ is a member of $K[X_1, \dots, X_n]$ that vanishes on the locus of
common zeros of $I$, then (b) in the theorem shows that $f^k$ is in $I$ for some $k$.
Since $I$ is prime, one of the factors of $f^k = f \cdots f$ lies in $I$.


EXAMPLE FOR COROLLARY. Let $K = \overline{K} = \mathbb{C}$, and let $I$ be the principal ideal in
$\mathbb{C}[X, Y]$ generated by $Y^2 - X(X + 1)(X - 1)$. Consider $\mathbb{C}[X, Y]$ as isomorphic
to $\mathbb{C}[X][Y]$. As a polynomial in $Y$ over $\mathbb{C}[X]$, $p(X, Y) = Y^2 - X(X + 1)(X - 1)$
is irreducible because $X(X + 1)(X - 1)$ is not the square of a polynomial in $X$.
Since $\mathbb{C}[X, Y]$ is a unique factorization domain, $p(X, Y)$ is prime. Therefore $I =
(p(X, Y))$ is a prime ideal. The corollary says that every polynomial vanishing
on the locus of points $(x, y) \in \mathbb{C}^2$ for which $y^2 = x(x + 1)(x - 1)$ is the product
of $Y^2 - X(X + 1)(X - 1)$ and a polynomial in $(X, Y)$. Consequently the ring
of restrictions of polynomials to the locus for which $y^2 = x(x + 1)(x - 1)$ is
isomorphic to $\mathbb{C}[X, Y]/(Y^2 - X(X + 1)(X - 1))$.


Theorem 7.1b has a tidy formulation in terms of the "radical" of an ideal. If
$R$ is a commutative ring with identity and $I$ is an ideal in $R$, then the radical of
$I$, denoted by $\sqrt{I}$, is the set of all $r$ in $R$ such that $r^k$ is in $I$ for some $k \geq 1$. It is
immediate that the radical of $I$ is an ideal containing $I$ and that $\sqrt{I}$ is proper if $I$
is proper. If $I$ is an ideal in $K[X_1, \dots, X_n]$ and if $f$ is in $\sqrt{I}$, then $f^k$ is in $I$ for
some $k > 0$, and hence $f$ vanishes on the locus of common zeros of $I$. Theorem
7.1b says conversely that any $f$ vanishing on the locus of common zeros of $I$ has
$f^k$ in $I$ for some $k > 0$. This means that $f$ is in $\sqrt{I}$. We can therefore rewrite
(b) in the theorem as follows:


(b') the ideal of all $f$ in $K[X_1, \dots, X_n]$ that vanish on the locus of common
zeros of $I$ in $\overline{K}^n$ is exactly $\sqrt{I}$.
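
A one-variable instance shows why the radical cannot be replaced by the ideal itself in this formulation. The ideal $(X^2)$ is chosen here purely for illustration and does not appear in the text.

```latex
\begin{align*}
I &= (X^2) \subseteq K[X], &
  \text{zero locus of } I \text{ in } \overline{K} &= \{0\}, \\
f &= X, &
  f(0) = 0, \quad f \notin I, \quad f^2 &\in I, \\
\sqrt{I} &= (X). &&
\end{align*}
```

Here the polynomials vanishing on the zero locus form the ideal $(X)$, which is strictly larger than $I$ and agrees with $\sqrt{I}$.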


The proof of Theorem 7.1 will follow comparatively easily from the following 
two lemmas. 


Lemma 7.3. If $K$ is a field and $L$ is an extension field that is generated as a
$K$ algebra by $n$ elements $x_1, \dots, x_n$, i.e., if $L = K[x_1, \dots, x_n]$, then every $x_j$ is
algebraic over $K$.




REMARKS. Conversely if $x_1, \dots, x_n$ are elements of an extension field $L$ that
are algebraic over $K$, then $K(x_1, \dots, x_n) = K[x_1, \dots, x_n]$. The reason is that

    $K(x_1, \dots, x_n) = K(x_1, \dots, x_{n-1})(x_n) = K(x_1, \dots, x_{n-1})[x_n]$
        $= K(x_1, \dots, x_{n-2})(x_{n-1})[x_n] = K(x_1, \dots, x_{n-2})[x_{n-1}][x_n]$
        $= \cdots = K[x_1] \cdots [x_{n-1}][x_n] = K[x_1, \dots, x_n]$.
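
The equality $K(x) = K[x]$ for algebraic $x$ can be seen concretely in the case $K = \mathbb{Q}$, $x = \sqrt{2}$. The Python sketch below is an illustration added here, with the hypothetical helper names `mul` and `inverse`. It represents $a + b\sqrt{2}$ as a pair of rationals and shows that the inverse of a nonzero element is again of this polynomial form.

```python
from fractions import Fraction

def mul(x, y):
    """Multiply a + b*sqrt(2) and c + d*sqrt(2), each given as a pair (a, b)."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)

def inverse(x):
    """Invert a + b*sqrt(2) inside Q[sqrt 2] by multiplying by the conjugate."""
    a, b = x
    norm = a * a - 2 * b * b   # nonzero unless a = b = 0, since sqrt(2) is irrational
    if norm == 0:
        raise ZeroDivisionError("the zero element has no inverse")
    return (a / norm, -b / norm)

x = (Fraction(3), Fraction(5))                   # the element 3 + 5*sqrt(2)
assert inverse(x) == (Fraction(-3, 41), Fraction(5, 41))
assert mul(x, inverse(x)) == (Fraction(1), Fraction(0))
```

Exact rational arithmetic is used so that the check of $x \cdot x^{-1} = 1$ is an equality and not an approximation.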


PROOF. We proceed by induction on $n$. For $n = 1$, if $L = K[x_1]$, then we
know from the elementary theory of fields that $x_1$ is algebraic over $K$.

For the inductive step, suppose that $L = K[x_1, \dots, x_n]$. Since $L$ is a field,
$K(x_1) \subseteq L$, and hence $L = K(x_1)[x_2, \dots, x_n]$. By the inductive hypothesis
applied to $L$ and $K(x_1)$, the elements $x_2, \dots, x_n$ are algebraic over $K(x_1)$. To
complete the proof, it is enough to show that $x_1$ is algebraic over $K$.

Fix $j \geq 2$. The element $x_j$, being algebraic over $K(x_1)$, satisfies a polynomial
equation

    $X^m + a_{m-1} X^{m-1} + \cdots + a_0 = 0$

with $a_{m-1}, \dots, a_0$ in $K(x_1)$. Clearing fractions, we see that $x_j$ satisfies an equation

    $b_m X^m + b_{m-1} X^{m-1} + \cdots + b_0 = 0$

with $b_m, \dots, b_0$ in $K[x_1]$ and $b_m \neq 0$. Multiplying through by $b_m^{m-1}$ shows that
$x_j$ satisfies

    $(b_m X)^m + b_{m-1}(b_m X)^{m-1} + \cdots + b_0 (b_m)^{m-1} = 0$,

and we see that $b_m x_j$ is integral over the ring $K[x_1]$. Let us write $c_j$ for the
element $b_m \in K[x_1]$ that we have just produced for this $j$.

In the case of $j = 1$, we can use $m = 1$ and $a_0 = -x_1$ in the above argument,
and we are then led to $c_1 = 1$. If $x_1^{l_1} \cdots x_n^{l_n}$ is any monomial in $K[x_1, \dots, x_n]$ and
if $l$ is defined as $l = \max(l_1, \dots, l_n)$, then the fact that the integral elements over
$K[x_1]$ form a ring implies that $(c_1 \cdots c_n)^l x_1^{l_1} \cdots x_n^{l_n}$ is integral over $K[x_1]$. Hence
for any $f$ in $K[x_1, \dots, x_n]$, $(c_1 \cdots c_n)^l f$ is integral over $K[x_1]$ for a suitable
integer $l = l(f)$. Since $K(x_1) \subseteq K[x_1, \dots, x_n]$, this conclusion applies in
particular to any member $f$ of $K(x_1)$.

The ring $K[x_1]$ is a principal ideal domain and is therefore integrally closed
in its field of fractions $K(x_1)$. For $f$ in $K(x_1)$, we have seen that $(c_1 \cdots c_n)^l f$
is integral over $K[x_1]$ for some $l = l(f)$. The element $(c_1 \cdots c_n)^l f$ is in $K(x_1)$,
and the integral-closure property therefore implies that $(c_1 \cdots c_n)^l f$ is in $K[x_1]$.

Consequently there exists a fixed element $h$ of $K[x_1]$ such that every element $f$
of $K(x_1)$ is of the form $g/h^l$ for some $g$ in $K[x_1]$ and some integer $l \geq 0$. We apply
this observation to $f = q(x_1)^{-1}$ for each irreducible polynomial $q(X)$ in $K[X]$,
and we obtain $q(x_1) g = h^l$ with $g$ and $l$ depending on $q(X)$. If $x_1$ is
transcendental over $K$, this equality implies the polynomial identity $q(X) g(X) = h(X)^l$.
Consequently every irreducible polynomial $q(X)$ divides $h(X)$. If $K$ is infinite,
this is a contradiction because there are infinitely many distinct polynomials $X - a$
in $K[X]$; if $K$ is finite, this is a contradiction because there exists at least one
irreducible polynomial of each degree $\geq 1$. We arrive at a contradiction in either
case, and therefore $x_1$ is algebraic over $K$. This completes the induction and the
proof.


Lemma 7.4. Let $K$ be a field, and let $L$ be an algebraic extension of $K$. If
$I$ is a proper ideal in $K[X_1, \dots, X_n]$, then $I L[X_1, \dots, X_n]$ is a proper ideal in
$L[X_1, \dots, X_n]$.


REMARK. As usual, the notation $I L[X_1, \dots, X_n]$ refers to the set of sums of
products of elements of $I$ and elements of $L[X_1, \dots, X_n]$.

PROOF. First let us identify the integral closure of $K[X_1, \dots, X_n]$ in the field
$L(X_1, \dots, X_n)$ as $L[X_1, \dots, X_n]$. The ring $L[X_1, \dots, X_n]$ is a unique
factorization domain, and Proposition 8.41 of Basic Algebra shows that it is integrally
closed. Consequently the integral closure of $K[X_1, \dots, X_n]$ in $L(X_1, \dots, X_n)$ is
contained in $L[X_1, \dots, X_n]$. On the other hand, the integral closure of
$K[X_1, \dots, X_n]$ in $L(X_1, \dots, X_n)$ contains $L$ because $L/K$ is algebraic, and
it contains each $X_j$. Therefore it contains $L[X_1, \dots, X_n]$ and must equal
$L[X_1, \dots, X_n]$.

Now we apply Proposition 8.53 of Basic Algebra to the ring $K[X_1, \dots, X_n]$,
its field of fractions $K(X_1, \dots, X_n)$, the extension field $L(X_1, \dots, X_n)$, and
the integral closure $L[X_1, \dots, X_n]$ of $K[X_1, \dots, X_n]$ in $L(X_1, \dots, X_n)$. The
proposition says that if $P$ is any maximal ideal of $K[X_1, \dots, X_n]$, then the ideal
$P L[X_1, \dots, X_n]$ is proper in $L[X_1, \dots, X_n]$. This result is to be applied to any
maximal ideal $P$ of $K[X_1, \dots, X_n]$ that contains $I$.


PROOF OF THEOREM 7.1. Let J be a maximal ideal in K[X_1, ..., X_n]. Then
L = K[X_1, ..., X_n]/J is a field. Hence L = K[x_1, ..., x_n] is a field if the x_j’s
are defined by x_j = X_j + J. Lemma 7.3 shows that each x_j is algebraic over K,
and the first conclusion of the theorem follows.

When this conclusion is applied to \overline{K} instead of K, then the fact that
\overline{K} is algebraically closed implies that each x_j lies in the cosets
determined by \overline{K}, i.e., the cosets of the constant polynomials.
Consequently for each j, there is an element a_j in \overline{K} such that
X_j - a_j lies in J. Then it follows that (X_1 - a_1, ..., X_n - a_n)
is contained in J. Since the ideal (X_1 - a_1, ..., X_n - a_n) is maximal,
J = (X_1 - a_1, ..., X_n - a_n). This proves that the maximal ideals are as in the
displayed expression in the theorem.
To prove (a), we apply Lemma 7.4 to the ideal I in K[X_1, ..., X_n] and to the
algebraic extension \overline{K} of K. The lemma produces a proper ideal of
\overline{K}[X_1, ..., X_n]


408 VII. Infinite Field Extensions 


containing I, and we extend it to a maximal ideal J of \overline{K}[X_1, ..., X_n].
From the previous paragraph of the proof, J is of the form
J = (X_1 - a_1, ..., X_n - a_n) for some (a_1, ..., a_n) in \overline{K}^n. The ideal
J is therefore identified as the kernel of the evaluation homomorphism of
\overline{K}[X_1, ..., X_n] at the point (a_1, ..., a_n). Every member of J thus
vanishes at (a_1, ..., a_n), and the same thing is true of every member of I.
This proves (a).

For (b), let I be a proper ideal in K[X_1, ..., X_n], and let f be as in (b).
Introduce an additional indeterminate Y, and let J be the ideal in
K[X_1, ..., X_n, Y] generated by I and fY - 1. If some point (x_1, ..., x_n, y) lies
on the locus of common zeros of J in \overline{K}^{n+1}, then (x_1, ..., x_n) lies on
the locus of common zeros of I in \overline{K}^n, since I ⊆ J; thus
f(x_1, ..., x_n) = 0, since f is assumed to vanish on all common zeros of I in
\overline{K}^n. Consequently f(x_1, ..., x_n)y - 1 = -1 ≠ 0, and we find that
f(X_1, ..., X_n)Y - 1 does not vanish on the locus of common zeros of J in
\overline{K}^{n+1}, contradiction. We conclude that no point (x_1, ..., x_n, y) lies
on the locus of common zeros of J in \overline{K}^{n+1}. By (a), we see that

    J = K[X_1, ..., X_n, Y].    (*)

Let us write X for the expression X_1, ..., X_n. Then (*) implies that

    1 = \sum_{i=1}^{r} p_i(X, Y) g_i(X) + q(X, Y)(f(X)Y - 1)    (**)

for some g_1, ..., g_r in I and some p_1, ..., p_r and q in K[X, Y]. Let φ be the
substitution homomorphism of K[X, Y] into K(X) that carries K into itself, X
into itself, and Y into f(X)^{-1}. Application of φ to (**) gives

    1 = \sum_{i=1}^{r} p_i(X, f(X)^{-1}) g_i(X),    (†)

since φ(f(X)Y - 1) = 0. If Y^k is the largest power of Y that appears in any of
the polynomials p_i(X, Y), then we can rewrite (†) as

    f(X)^k = \sum_{i=1}^{r} (f(X)^k p_i(X, f(X)^{-1})) g_i(X)

and exhibit f(X)^k as the sum of products of the members g_i of I by members of
K[X]. Thus f(X)^k is in I, and (b) is proved.
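
The argument for (b) can be traced by hand in a very small case. The following sketch is an illustration only and is not part of the proof: it assumes K = Q, n = 1, I = (X^2), and f = X, and the helper names mul, add, and power are hypothetical. Polynomials in X and Y are stored as dictionaries that send an exponent pair (i, j) to the coefficient of X^i Y^j.

```python
from collections import defaultdict

def mul(p, q):
    """Product of two polynomials stored as {(i, j): coeff} for X^i Y^j."""
    out = defaultdict(int)
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            out[(i1 + i2, j1 + j2)] += c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def add(p, q):
    """Sum of two polynomials in the same representation."""
    out = defaultdict(int)
    for poly in (p, q):
        for m, c in poly.items():
            out[m] += c
    return {m: c for m, c in out.items() if c != 0}

def power(p, n):
    """n-th power of p, with p**0 equal to the constant 1."""
    out = {(0, 0): 1}
    for _ in range(n):
        out = mul(out, p)
    return out

# Assumed toy data: I = (g1) with g1 = X^2, and f = X vanishes wherever g1 does.
f = {(1, 0): 1}
g1 = {(2, 0): 1}
fY_minus_1 = {(1, 1): 1, (0, 0): -1}      # f(X)Y - 1

# One choice of p_1 and q that realizes the analog of (**).
p1 = {(0, 2): 1}                          # p_1(X, Y) = Y^2
q = {(1, 1): -1, (0, 0): -1}              # q(X, Y) = -(XY + 1)
identity = add(mul(p1, g1), mul(q, fY_minus_1))

# Clear denominators: replace each X^i Y^j of p_1 by X^i f^(k - j),
# which is f^k times the value of that monomial at Y = 1/f.
k = max(j for (_, j) in p1)
cleared = {}
for (i, j), c in p1.items():
    cleared = add(cleared, mul({(i, 0): c}, power(f, k - j)))

print(identity)                           # the constant polynomial 1
print(mul(cleared, g1) == power(f, k))    # f^k is a multiple of g1
```

Here k = 2, and the final comparison expresses f^2 = 1 · g_1, so f^2 lies in I even though f itself does not.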


2. Transcendence Degree 


Let K be a field, and let L be an extension field. The algebraic construction in 
this section will show that L can be obtained from K in two steps, by a “purely 
transcendental” extension followed by an algebraic extension. The number of
indeterminates in the first step (or the cardinality if the number is infinite) will be
seen to be an invariant of the construction and will be called the “transcendence 
degree” of L/K. 

Before coming to the details, let us mention what transcendence degree will
mean geometrically. Suppose that the field K is algebraically closed, suppose
that I is a prime ideal in K[X_1, ..., X_n], and suppose that V is the locus of
common zeros of I. Corollary 7.2 shows that I is the set of all polynomials
vanishing on V, and thus the integral domain K[X_1, ..., X_n]/I may be regarded
as the set of all restrictions to V of polynomials. If L is the field of fractions of
K[X_1, ..., X_n]/I, then the transcendence degree of L/K will be interpreted as
the “number of independent variables” or “dimension” of the locus V.

Now we can make the precise definitions. Let K be a field, and let L be
an extension field. A finite subset {x_1, ..., x_n} of L is said to be algebraically
independent over K if the ring homomorphism K[X_1, ..., X_n] → L given by
f ↦ f(x_1, ..., x_n) is one-one.² Otherwise it is algebraically dependent.
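
As an informal illustration of the definition, the sketch below exhibits a nonzero polynomial in the kernel of a substitution homomorphism. The setting is an assumption made for the illustration: K = Q, L = Q(t), and the two elements x = t^2 and y = t^3. The function names are hypothetical. Elements of Q[t] are coefficient lists indexed by degree, and a polynomial F(X, Y) is a dictionary sending (i, j) to the coefficient of X^i Y^j.

```python
def poly_mul(a, b):
    """Product of two polynomials in t given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_pow(a, n):
    """n-th power of a, with a**0 equal to the constant 1."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, a)
    return out

def evaluate(F, x, y):
    """Image of F = {(i, j): coeff} under X -> x, Y -> y, as a polynomial in t."""
    total = [0]
    for (i, j), c in F.items():
        term = [c * coef for coef in poly_mul(poly_pow(x, i), poly_pow(y, j))]
        n = max(len(total), len(term))
        total = [(total[d] if d < len(total) else 0)
                 + (term[d] if d < len(term) else 0) for d in range(n)]
    while len(total) > 1 and total[-1] == 0:
        total.pop()                      # drop trailing zero coefficients
    return total

x = [0, 0, 1]                            # t^2
y = [0, 0, 0, 1]                         # t^3
F = {(3, 0): 1, (0, 2): -1}              # X^3 - Y^2, a nonzero polynomial

print(evaluate(F, x, y))                 # [0]: F lies in the kernel
print(evaluate({(1, 0): 1, (0, 0): -1}, x, y))   # X - 1 maps to t^2 - 1
```

Since X^3 - Y^2 is nonzero but maps to 0, the set {t^2, t^3} is algebraically dependent over Q. A single check of this kind can only confirm dependence; independence requires that no nonzero polynomial lie in the kernel.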


EXAMPLE. Let K = C, and let p(X, Y) = Y^2 - X(X + 1)(X - 1). The
principal ideal I = (p(X, Y)) was shown to be prime in C[X, Y] in the example
with Corollary 7.2. Therefore C[X, Y]/I is an integral domain. Let x and y be
the cosets x = X + I and y = Y + I. If L denotes the field of fractions of
C[X, Y]/I, then we may regard x and y as members of L. The subset {x, y} of L
is algebraically dependent because the polynomial p(X, Y) maps to 0 under the
substitution homomorphism of C[X, Y] into L with X ↦ x and Y ↦ y.


A subset S of L is called a transcendence set over K if each finite subset of
S is algebraically independent over K. A maximal transcendence set over K is
called a transcendence basis of L over K. For each transcendence set S of L
over K, we write K(S) for the smallest subfield of L containing K and S. If some
transcendence set S has the property that K(S) = L, then L is said to be a
purely transcendental extension of K; in this case it follows from the definitions
that S is a transcendence basis of L over K.


EXAMPLE, CONTINUED. With K and L as in the example above, the sets S = {x}
and S = {y} are transcendence sets over K = C. It is not hard to see that {x} is a
transcendence basis of L over K. Actually, if z is any member of L that is not in C,
then {z} is a transcendence set over C. The reason is that C is algebraically closed;
hence either z is transcendental over C or else z lies in C. Lemma 7.6 below shows
that any transcendence set of L over C can be extended to a transcendence basis,
and Theorem 7.9 shows that all transcendence bases of L over C have the same
cardinality. It follows that if z is any member of L that is not in C, then {z} is a
transcendence basis of L over C and that every transcendence basis of L over C
is of this form. The two-element set {x, y} cannot be a transcendence set by this
reasoning, but we can see this conclusion more directly just by observing that
{x, y} was shown in the example above to be algebraically dependent.

² By convention the empty set is algebraically independent over K.


Shortly we shall establish the existence of transcendence bases in general. If 
S is a transcendence basis and if K’ is defined to be K(S), then we shall show 
that L is algebraic over K’. The subfield K’ of L depends on the choice of S, but 
there is a uniqueness theorem: the cardinality of a transcendence basis of L/K 
is independent of the particular transcendence basis. 


Lemma 7.5. Let L/K be a field extension, let S be a transcendence set of 
L over K, let K(S) be the subfield of L generated by K and S, and let x be an 
element of L not in S. Then S’ = S U {x} is a transcendence set of L over K if 
and only if x is transcendental over K(S).


PROOF. Suppose that x is transcendental over K(S) and is not in S. Let n
distinct elements x_1, ..., x_n of S’ be given. If these are all in S, then
f ↦ f(x_1, ..., x_n) is one-one because S is a transcendence set. Suppose that one of
the n elements is x; say x_n = x. If f is in the kernel of the homomorphism
f ↦ f(x_1, ..., x_n), i.e., if f(x_1, ..., x_n) = 0, then x is a root of the polynomial
g(X) = f(x_1, ..., x_{n-1}, X) in K(x_1, ..., x_{n-1})[X]. Since x is assumed to
be transcendental over K(S), the polynomial g must be 0. If we expand the
polynomial f in powers of X as

    f(X_1, ..., X_{n-1}, X) = c_l(X_1, ..., X_{n-1}) X^l + ... + c_0(X_1, ..., X_{n-1}),

the condition that g be 0 says that c_j(x_1, ..., x_{n-1}) = 0 for all j. Since the set
{x_1, ..., x_{n-1}} is algebraically independent, we see that c_j = 0. Therefore f = 0.
Hence {x_1, ..., x_n} is algebraically independent, and S’ is a transcendence set.

Conversely suppose that S’ is a transcendence set of L over K. We are to
show that the only polynomial F(X) in K(S)[X] such that F(x) = 0 is the 0
polynomial. Since only finitely many coefficients of F are in question, we may
view F as in K({x_1, ..., x_n})[X] for some finite subset {x_1, ..., x_n} of S. Clearing
fractions, we can write F as

    F(X) = d(x_1, ..., x_n)^{-1} (c_l(x_1, ..., x_n) X^l + ... + c_0(x_1, ..., x_n))

for suitable polynomials d, c_0, ..., c_l in K[X_1, ..., X_n] with d(x_1, ..., x_n) ≠ 0.
Define

    \tilde{F}(X_1, ..., X_n, X) = c_l(X_1, ..., X_n) X^l + ... + c_0(X_1, ..., X_n).

The condition F(x) = 0 yields \tilde{F}(x_1, ..., x_n, x) = 0. Since {x_1, ..., x_n, x}
is by assumption algebraically independent over K, we see that \tilde{F} = 0. Thus
c_j(X_1, ..., X_n) = 0 for all j, and consequently c_j(x_1, ..., x_n) = 0 for all j.
Therefore F = 0, as required.




Lemma 7.6. If L/K is a field extension, then 


(a) any transcendence set of L over K can be extended to a transcendence 
basis of L over K, 

(b) any subset of L that generates L as a field over K has a subset that is a 
transcendence basis of L over K. 


In particular, there exists a transcendence basis of L over K. 


PROOF. For (a), order by inclusion upward the transcendence sets containing 
the given one. To apply Zorn’s Lemma, we need only show that the union of a 
chain of transcendence sets in L over K is again a transcendence set. Thus let 
finitely many elements of the union of the sets in the chain be given. Since the sets 
in the chain are nested, all these elements lie in one member of the chain. Hence 
they are algebraically independent over K, and it follows from the definition that 
the union of the sets in the chain is a transcendence set. By Zorn’s Lemma there 
exists a maximal transcendence set, and this is a transcendence basis by definition. 

For (b), we argue in the same way as for (a). Let the given generating set 
be G. Order by inclusion upward the transcendence sets that are subsets of G. 
The empty set is such a transcendence set. As with (a), the union of a chain of 
transcendence sets in L over K is again a transcendence set, and the union is 
contained in G if each individual set is. By Zorn’s Lemma there exists a maximal 
transcendence subset S of G. To complete the proof, it is enough to show that 
every member of G is algebraic over K(S). Let x be in G. We may assume that 
x is not in S. By maximality, S ∪ {x} is not a transcendence set. Then Lemma
7.5 shows that x is algebraic over K(S). Hence S is the required transcendence
basis. 

For the final conclusion we apply (a) to the empty set, which is a transcendence 
set of L over K. 


Theorem 7.7. If L/K is a field extension, then there exists an intermediate 
field K’ such that K’/K is purely transcendental and L/K’ is algebraic. 


PROOF. Lemma 7.6 produces a transcendence basis S for L/K. Define K’ 
to be the intermediate field K(S) generated by K and S. Then K’ is purely 
transcendental over K by definition. If x is a member of L that is not in K’, then
S ∪ {x} is not a transcendence set of L over K by maximality of S, and Lemma
7.5 shows that x is algebraic over K(S) = K’. Hence L is algebraic over K’.


As was mentioned earlier in the section, the intermediate field K’ with the
properties stated in the theorem is not unique. In the example above with K = C
and with L equal to the field of fractions of C[X, Y]/(Y^2 - X(X + 1)(X - 1)),
K’ can be any subfield C(z) with z not in the subfield C. For an even simpler
example, let K be arbitrary, and let L = K(x) be any purely transcendental
extension. Use of the transcendence basis {x} of L over K leads to K’ = L in
the proof of Theorem 7.7. But {x^2} is another transcendence basis, and for it we
have K’ = K(x^2). The extension L/K’ is algebraic because x is a root of the
polynomial X^2 - x^2 in K(x^2)[X].

We turn to the matter of showing that any two transcendence bases of L over 
K have the same cardinality. We shall make use of the following result, which 
was proved at the end of the appendix of Basic Algebra: 


    Let S and E be nonempty sets with S infinite, and suppose that to
    each element s of S is associated a countable subset E_s of E in such
    a way that E = \bigcup_{s ∈ S} E_s. Then card E ≤ card S.

In our application of this result, the sets E_s will all be finite sets.


Lemma 7.8 (Exchange Lemma). Let L/K be a field extension. If E is any
subset of L, let K(E) be the subfield of L generated by K and E, and let
\overline{K(E)} be the subfield of all elements in L that are algebraic over K(E).
If E ∪ {x} and E ∪ {y} are finite transcendence sets of L over K and if x lies in
\overline{K(E ∪ {y})} but not \overline{K(E)}, then y lies in \overline{K(E ∪ {x})}.


PROOF. The condition that x lie in \overline{K(E ∪ {y})} implies that there exist
a finite subset {x_1, ..., x_n} of E and a member f of K(X_1, ..., X_n, Y)[Z] such that

    f(x_1, ..., x_n, y, Z) ≠ 0   but   f(x_1, ..., x_n, y, x) = 0.    (*)

Clearing fractions, we may assume that f lies in K[X_1, ..., X_n, Y, Z]. Expand
f in powers of Y as

    f(X_1, ..., X_n, Y, Z) = \sum_{j=0}^{l} c_j(X_1, ..., X_n, Z) Y^j.

Since f(x_1, ..., x_n, y, Z) ≠ 0 by (*), at least one of the coefficients, say
c_i, has to satisfy c_i(x_1, ..., x_n, Z) ≠ 0. Lemma 7.5 shows that x is
transcendental over K(E), and therefore c_i(x_1, ..., x_n, x) ≠ 0. Consequently
f(x_1, ..., x_n, Y, x) is nonzero. Since f(x_1, ..., x_n, y, x) = 0 by (*), y is
algebraic over K({x_1, ..., x_n, x}). Therefore y lies in \overline{K(E ∪ {x})}.


The statement of Lemma 7.8 defines an operation E ↦ \overline{K(E)} on subsets
of L. Because an algebraic extension of an algebraic extension is algebraic, applying
this operation a second time does nothing new:
\overline{K(\overline{K(E)})} = \overline{K(E)}. We shall
make use of this fact in the proof of Theorem 7.9 below.


Theorem 7.9. If L/K is a field extension, then any two transcendence bases 
of L over K have the same cardinality. 




REMARKS. The cardinality is called the transcendence degree of L/K. For 
applications to algebraic geometry, the situation of interest is that this cardinality 
is finite, but we give a complete proof of the theorem anyway. 


PROOF. First suppose that L/K has a finite transcendence basis B. Let |B| = n.
Let B’ be another transcendence basis, and let m = |B ∩ B’|. We prove that
|B’| = |B| by induction downward on m. The base case of the induction is that
m = n. Then B ⊆ B’, and we must have B = B’ by maximality of B.

For the inductive step, suppose that m < n and that |B’| = |B| whenever
|B ∩ B’| ≥ m + 1. We write the elements of B in an order such that
B = {x_1, ..., x_n} and B ∩ B’ = {x_1, ..., x_m}. Lemma 7.5 shows that x_{m+1} is
transcendental over K(B - {x_{m+1}}). Hence x_{m+1} does not lie in
\overline{K(B - {x_{m+1}})}. A second application of Lemma 7.5 shows that
L = \overline{K(B’)}. The inclusion B’ ⊆ \overline{K(B - {x_{m+1}})} is impossible
because otherwise we would have

    L = \overline{K(B’)} ⊆ \overline{K(\overline{K(B - {x_{m+1}})})} = \overline{K(B - {x_{m+1}})}.

Hence there exists an element y of B’ that does not lie in
\overline{K(B - {x_{m+1}})}. A third application of Lemma 7.5 shows that
(B - {x_{m+1}}) ∪ {y} is a transcendence set for L/K. Since y lies in
L = \overline{K(B)}, the Exchange Lemma (Lemma 7.8) shows that x_{m+1} lies in
\overline{K((B - {x_{m+1}}) ∪ {y})}. Consequently B is contained in
\overline{K((B - {x_{m+1}}) ∪ {y})}, and L = \overline{K((B - {x_{m+1}}) ∪ {y})}.
A fourth application of Lemma 7.5 shows that the transcendence set
B_1 = (B - {x_{m+1}}) ∪ {y} is a transcendence basis. The set B_1 has n elements,
and the inclusion B_1 ∩ B’ ⊇ {x_1, ..., x_m, y} shows that |B_1 ∩ B’| ≥ m + 1.
The inductive hypothesis shows that |B’| = |B_1|, and therefore |B’| = |B|.
This completes the proof under the assumption that L/K has a finite
transcendence basis.

We may now suppose that L/K has no finite transcendence basis. Let B be a
transcendence basis of L/K; existence is by Lemma 7.6. To each element x of
L, we shall associate a canonical finite subset E_x of L.

Since the element x is algebraic over K(B), use of the field polynomial of x
over K(B) shows that x is algebraic over K(E) for some finite subset E of B.
Let E_0 be such a finite set E with the smallest cardinality; the set E_0 will be
the canonical finite subset E_x that we seek. To show that E_0 is canonical, we
show that whenever x lies in \overline{K(E)} for some finite subset E of B, then
E_0 ⊆ E. Arguing by contradiction, suppose that y is a member of E_0 that is not
in E, and define E_1 = E_0 - {y}. By minimality of |E_0|, x does not lie in
\overline{K(E_1)}. However, x does lie in \overline{K(E_1 ∪ {y})}. Application of
the Exchange Lemma shows that y lies in \overline{K(E_1 ∪ {x})}. Since

    \overline{K(E_1 ∪ {x})} ⊆ \overline{K(E_1 ∪ \overline{K(E)})} ⊆ \overline{K(\overline{K(E_1 ∪ E)})} = \overline{K(E_1 ∪ E)},

y lies in \overline{K(E_1 ∪ E)}. Since y is in B but is not in E_1 ∪ E, Lemma 7.5
shows that y is not algebraic over K(E_1 ∪ E), and we arrive at a contradiction.
This completes the proof that whenever x lies in \overline{K(E)} for some finite
subset E of B, then E_0 ⊆ E. Hence E_0 is canonical.

For each element x of L, we let E_x be the finite subset of B constructed in the
previous paragraph. Then we have a well-defined map of L to the set of all finite
subsets of B given by x ↦ E_x ⊆ B. Now let B’ be a second transcendence basis
of L/K, and restrict the map from L to B’. Taking S = B’ and
E = \bigcup_{x ∈ B’} E_x in the indented result quoted just before Lemma 7.8,
we find that

    card(\bigcup_{x ∈ B’} E_x) ≤ card(B’).    (*)

On the other hand, any x in B’ lies in \overline{K(E_x)} by definition of E_x.
Hence B’ ⊆ \overline{K(\bigcup_{x ∈ B’} E_x)}. Applying the operation
\overline{K(·)} to both sides gives

    L = \overline{K(B’)} ⊆ \overline{K(\overline{K(\bigcup_{x ∈ B’} E_x)})} = \overline{K(\bigcup_{x ∈ B’} E_x)}.

Since \bigcup_{x ∈ B’} E_x is a subset of B and since a proper subset of B cannot
be a transcendence basis of L/K, we conclude that

    B = \bigcup_{x ∈ B’} E_x.

Consequently

    card B = card(\bigcup_{x ∈ B’} E_x).

In combination with (*), this equality implies that card B ≤ card B’. Reversing
the roles of B and B’ gives card B’ ≤ card B. Therefore card B = card B’ by the
Schroeder-Bernstein Theorem.³


3. Separable and Purely Inseparable Extensions 


Thus far in this book, we have been interested in the detailed structure of algebraic 
field extensions only when they are separable. For applications to algebraic 
geometry, however, algebraic extensions that are not separable arise and even 
play a special role. Thus it is essential to have some understanding of their 
nature. 

Let us review the material on separability in Section IX.6 of Basic Algebra.
Let K be a field. An irreducible polynomial in K[X] is defined to be separable if
it splits into distinct first-degree factors in its splitting field over K. Let L/K be
an algebraic extension of fields. An element of L is defined to be separable over
K if its minimal polynomial over K is separable. Elements of L that fail to be
separable over K are called inseparable over K. The prototype of an inseparable
element is the element a^{1/p} in the extension k(a^{1/p}), where k = F_p(a) is a
simple transcendental extension of the finite field F_p. Corollary 9.31 of Basic
Algebra shows that the separable elements of L over K form a subfield, and L/K is
defined to be separable if every member of L is separable over K. As
a consequence of Corollary 9.29 of Basic Algebra, we know that a separable
extension of a separable extension is separable.

³ A proof of the Schroeder-Bernstein Theorem appears in the appendix of Basic Algebra.

One further tool from Basic Algebra is needed in order to handle the failure of
separability. This is Proposition 9.27, which says that an irreducible polynomial
f(X) in K[X] is separable if and only if f’(X) is not the zero polynomial. It is
immediate that every irreducible polynomial is separable if K has characteristic 0.
Thus we need discuss only characteristic p in the remainder of this section.

The consequence of Proposition 9.27 for characteristic p is that an irreducible
polynomial f(X) fails to be separable over K if and only if the only powers of
X that appear with nonzero coefficient in f(X) are the powers X^{kp}, i.e., if and
only if f(X) = g(X^p) for some g in K[X].

In this case the polynomial g(X) is certainly irreducible in K[X], and we can
repeat this process. The polynomial g(X) fails to be separable over K if and
only if g(X) = h(X^p) for some h in K[X]. Then f(X) = h(X^{p^2}). Repeating
this process as many times as possible, we see that to each irreducible polynomial
f(X) in K[X] correspond a unique nonnegative integer e and a unique separable
irreducible polynomial g(X) such that f(X) = g(X^{p^e}). We call p^e the degree of
inseparability of f(X) over K. From the definitions an element of an algebraic
extension of K is inseparable if and only if the degree of inseparability of its
minimal polynomial over K is greater than 1.
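
The repeated passage from f(X) to g(X) with f(X) = g(X^p) depends only on which powers of X occur in f. The sketch below records that bookkeeping. It is an illustration under stated assumptions: the function name inseparability_exponent is hypothetical, the input is the set of exponents carrying nonzero coefficients, and irreducibility of f is taken for granted rather than checked.

```python
def inseparability_exponent(exponents, p):
    """Largest e such that every exponent is divisible by p**e.

    `exponents` lists the powers of X with nonzero coefficient in a
    nonconstant polynomial f, assumed irreducible over a field of
    characteristic p.  Then f(X) = g(X**(p**e)) with g not a polynomial
    in X**p, and p**e is the degree of inseparability of f.
    """
    positive = [n for n in exponents if n > 0]
    if not positive:
        raise ValueError("f must be nonconstant")
    e = 0
    while all(n % p ** (e + 1) == 0 for n in positive):
        e += 1
    return e

# Assumed examples over k = F_3(a), with p = 3:
print(inseparability_exponent({9, 0}, 3))     # X^9 - a
print(inseparability_exponent({6, 3, 0}, 3))  # X^6 + a X^3 + a
print(inseparability_exponent({2, 0}, 3))     # X^2 - a
```

For these three inputs the exponents e are 2, 1, and 0, so the degrees of inseparability are 9, 3, and 1; only the last polynomial is separable.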

If L/K is an algebraic field extension, then an element α of L is said to be
purely inseparable⁴ over K if α^{p^μ} lies in K for some integer μ ≥ 0. Let us see
in this case that the minimal polynomial of α over K is of the form
X^{p^e} - α^{p^e} for some e ≥ 0.

⁴ Warning: Not every element of L that is purely inseparable over K is inseparable
over K. The elements of K are counterexamples. Corollary 7.12 below shows that the
elements of K are the only counterexamples.


Proposition 7.10. If K is a field of characteristic p and if a is a member of K
such that \sqrt[p]{a} is not in K, then X^{p^μ} - a is irreducible in K[X] for
every μ > 0.


PROOF. Let L be a splitting field of X^{p^μ} - a over K. If β is a root of
X^{p^μ} - a, then β^{p^μ} = a, and hence
X^{p^μ} - a = X^{p^μ} - β^{p^μ} = (X - β)^{p^μ}.

Let f(X) be a monic irreducible factor of X^{p^μ} - a in K[X]. Let us see that
X^{p^μ} - a = f(X)^n for some n. In fact, if the contrary were true, then there
would be a second monic irreducible factor g(X) of X^{p^μ} - a in K[X] relatively
prime to f(X). Then we can write u(X)f(X) + v(X)g(X) = 1 for suitable
polynomials u(X) and v(X) in K[X]. As members of L[X], both f(X) and
g(X) have to be powers of X - β by unique factorization, and thus they both
vanish at β. Substitution of β into uf + vg = 1 therefore yields a contradiction.
Hence X^{p^μ} - a = f(X)^n.

Since f(X) has to be (X - β)^m for some m, we obtain
X^{p^μ} - a = f(X)^n = (X - β)^{mn}. The integers m and n must divide p^μ. Thus
m = p^ν, and f(X) = (X - β)^{p^ν} = X^{p^ν} - β^{p^ν}. Since f(X) is assumed to
be in K[X], β^{p^ν} lies in K. An inequality ν < μ would imply that
γ = (β^{p^ν})^{p^{μ-ν-1}} lies in K; the p-th power of γ is a, however, and the
hypothesis of the proposition says that such an element γ cannot be in K. We
conclude that ν = μ, and thus f(X) = X^{p^μ} - a. In other words,
X^{p^μ} - a is irreducible in K[X].
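
The first step of the proof uses the characteristic p identity (X - β)^{p^μ} = X^{p^μ} - β^{p^μ}, which rests on the fact that the binomial coefficients of p^μ other than the two extreme ones are divisible by p. The following sketch checks that divisibility numerically for a few small cases; the function name is hypothetical, and the check is an illustration rather than a proof.

```python
from math import comb

def middle_binomials_vanish(p, mu):
    """True if C(p**mu, j) is divisible by p for every 0 < j < p**mu."""
    n = p ** mu
    return all(comb(n, j) % p == 0 for j in range(1, n))

for p in (2, 3, 5, 7):
    for mu in (1, 2, 3):
        print(p, mu, middle_binomials_vanish(p, mu))

# Contrast: the exponent 6 is not a power of 3, and C(6, 3) = 20 is
# not divisible by 3, so (X - b)^6 has a middle term in characteristic 3.
print(comb(6, 3) % 3)
```

Every line of the loop prints True, while the last line prints a nonzero residue, which is why the exponent in the identity has to be a power of p.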


Corollary 7.11. If L/K is an algebraic extension in characteristic p, if α is
a purely inseparable element of L over K, and if e is the smallest nonnegative
integer such that α^{p^e} lies in K, then the minimal polynomial of α over K is
X^{p^e} - α^{p^e}.


PROOF. This is immediate from Proposition 7.10. 


Corollary 7.12. If L/K is an algebraic extension in characteristic p and if α
is an element of L that is separable and purely inseparable over K, then α lies
in K.


PROOF. Since α is purely inseparable over K, Corollary 7.11 says that the
minimal polynomial of α over K is X^{p^e} - α^{p^e}, where e is the smallest
nonnegative integer such that α^{p^e} lies in K. The separability of α says that
this polynomial is separable. Unless p^e = 1, the polynomial has derivative 0 and
thus repeated roots. Therefore p^e = 1 and e = 0, and we conclude that α lies in K.


An algebraic field extension L/K in characteristic p is said to be purely
inseparable if every element of L is purely inseparable over K. Since purely
inseparable elements α have minimal polynomials of the form X^{p^e} - α^{p^e}, the
degree of a purely inseparable extension has to be a power of p.


Theorem 7.13. If L/K is an algebraic field extension in characteristic p and
if K_s is the subfield of all elements of L that are separable over K, then L/K_s is
a purely inseparable extension.


PROOF. Let α be an element of L, and let f(X) be the minimal polynomial
of α over K. Then we can write f(X) = g(X^{p^e}), where p^e is the degree
of inseparability of f. The polynomial g(X) is irreducible over K, and it is
separable. Since α^{p^e} is a root, α^{p^e} is a separable element. Therefore
α^{p^e} lies in K_s. By definition of pure inseparability, α is purely inseparable
over K_s. Since α is arbitrary in L, L is purely inseparable over K_s.




Corollary 7.14. Let R be a Dedekind domain, let F be its field of fractions, 
let K be a finite algebraic extension of F, and let T be the integral closure of R 
in K. Then T is a Dedekind domain. 


REMARKS. This result is quite important. It was used extensively in Chapter VI, 
as was explained in the remarks with Proposition 6.7, and it plays a foundational 
role in the theory of algebraic curves as presented in Chapters IX and X. Theorem 
8.54 of Basic Algebra proved this result under the assumption that K is a finite
separable extension of F, and we are now dropping the hypothesis of separability.
Since K/F is automatically a separable extension in characteristic 0, we may 
assume that the characteristic is not 0. 


PROOF. Theorem 7.13 shows that K can be obtained in two steps from F,
a separable extension followed by a purely inseparable extension. The integral
closure of R in the separable extension field is a Dedekind domain D by Theorem
8.54 of Basic Algebra, and the integral closure of D in K equals T by the
transitivity of integral closure. Consequently it is enough to prove the corollary
under the additional hypothesis that K is a purely inseparable extension of F.
What needs proof (in view of the statement of Theorem 8.54 of Basic Algebra)
is that T is Noetherian, i.e., that each ideal of T is finitely generated.

Let p be the characteristic. Since K/F is finite and purely inseparable, there
exists some power q = p^m of p such that the field K^q is contained in F;
specifically, the integer q is to be large enough for the q-th power of each element
of a vector-space basis of K over F to lie in F. We begin by proving that

    T = {b ∈ K | b^q ∈ R}.    (*)

The inclusion ⊆ follows, since b ∈ T implies that b^q is in T ∩ F = R. For the
inclusion ⊇, let b ≠ 0 be in K with b^q in R. Corollary 7.11 shows that the minimal
polynomial of b over F is X^{p^e} - b^{p^e}, where e is the smallest integer ≥ 0
such that b^{p^e} lies in F. Since K^{p^m} ⊆ F, e ≤ m. Thus b is a root of a
polynomial X^{p^m} - a, where a = b^{p^m} is a member of R. Consequently b is
integral over R and must lie in T. This proves (*).

Fix an algebraic closure K_alg of K, and let H = F^{1/q} denote the inverse image
of F under the q-th power isomorphism of K_alg onto itself. This is a subfield of
K_alg, and it contains K because K^q ⊆ F. Let S ⊆ H be the ring of all b in H
with b^q in R. Since x ↦ x^q is a field isomorphism of H onto F, x ↦ x^q is a
ring isomorphism of S onto R. Therefore S is a Dedekind domain. It contains T
by (*).

Let I be a nonzero ideal in T, and form the ideal J = SI in S generated by
I. Since S is Dedekind, J is invertible as a fractional ideal of H relative to S. If
J^{-1} denotes the inverse, then J^{-1} is a finitely generated S module in H such
that J^{-1}J = S. Thus S = J^{-1}J = J^{-1}SI = J^{-1}I. Accordingly, choose finite
sets {x_i} in J^{-1} and {a_i} in I such that \sum_i x_i a_i = 1.

We shall show that {a_i} is a set of generators of I as an ideal in T. We
apply the q-th power mapping to \sum_i x_i a_i = 1, obtaining
\sum_i x_i^q a_i^q = 1 with x_i^q in H^q = F ⊆ K and with a_i^q in S^q = R. Put
b_i = a_i^{q-1} x_i^q. Then \sum_i x_i^q a_i^q = 1 implies that \sum_i a_i b_i = 1;
here a_i is in I and b_i is in I^{q-1}K ⊆ K. If a is in I, then
\sum_i (b_i a) a_i = a, and it is enough to show that b_i a is in T for each i, i.e.,
to show that b_i I ⊆ T for each i.

The q-fold product (x_i I) ··· (x_i I) is contained in S because
x_i I ⊆ J^{-1}J = S. Thus b_i I = x_i^q a_i^{q-1} I ⊆ S. So b_i I ⊆ S ∩ K. If s is
any element in S ∩ K, then we know that r = s^q is a member of R because
S^q = R. Hence s is a root of X^q - r with r in R. That is, s is integral over R.
Since s also is in K, s lies in the integral closure of R in K, which is T. Thus
b_i I ⊆ T, and the proof is complete.


A field K is perfect if either it has characteristic 0 or else it has characteristic
p and the field map x ↦ x^p of K into itself is onto. Examples of perfect fields
include all finite fields, all algebraically closed fields, and of course all fields of
characteristic 0.
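
For a finite field the map x ↦ x^p is one-one, hence onto, and this can be watched directly in a small case. The sketch below is an illustration under an assumed model: it takes p = 3 and realizes the field with 9 elements as F_3[i] with i^2 = -1, which is legitimate because -1 is not a square modulo 3. The names P, mul, and frobenius are hypothetical.

```python
from itertools import product

P = 3  # assumed characteristic; pairs (a, b) stand for a + b*i with i^2 = -1

def mul(u, v):
    """Product of a + b*i and c + d*i with coefficients reduced mod P."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def frobenius(u):
    """The P-th power of u, computed by repeated multiplication."""
    out = (1, 0)
    for _ in range(P):
        out = mul(out, u)
    return out

elements = list(product(range(P), repeat=2))   # all 9 field elements
images = {frobenius(u) for u in elements}

print(len(images) == len(elements))   # the map x -> x^3 hits every element
print(frobenius((1, 1)))              # the cube of 1 + i
```

In this model the cube of 1 + i is 1 - i, written as the pair (1, 2). By contrast, in the field F_p(a) used earlier as the prototype of inseparability, the element a is not a p-th power, so that field is not perfect.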


Proposition 7.15. A field K is perfect if and only if every algebraic extension 
of K is separable. 


PROOF. We need to consider only the case that K has characteristic p. Suppose
that x ↦ x^p fails to be onto K. Choose β in K such that X^p - β has no root
in K. Proposition 7.10 shows that X^p - β is irreducible over K. Since this
polynomial has derivative 0, it is not separable. Thus X^p - β is a polynomial that
is irreducible but not separable, and adjunction of a root of X^p - β to K produces
an extension L of K that is not separable.

Conversely suppose that the field map x ↦ x^p of K to itself is onto. Then
x ↦ x^{p^e} is onto K for every e ≥ 0. Let L be an algebraic extension of K,
and let K_s be the subfield of elements separable over K. If α is given in L,
then Theorem 7.13 shows that there exists a nonnegative integer e such that
α^{p^e} is in K_s. Let g(X) be the minimal polynomial of α^{p^e} over K, and write
g(X) = X^m + c_1 X^{m-1} + ... + c_m. Since K is perfect, there exists b_j for each
j with 1 ≤ j ≤ m such that b_j^{p^e} = c_j. Put
f(X) = X^m + b_1 X^{m-1} + ... + b_m. Then

    f(α)^{p^e} = α^{mp^e} + b_1^{p^e} (α^{p^e})^{m-1} + ... + b_m^{p^e} = g(α^{p^e}) = 0,

and therefore f(α) = 0. Consequently the minimal polynomial of α over K divides
f(X), and the fact that α^{p^e} lies in K(α) implies that

    [K(α) : K] ≤ deg f(X) = deg g(X) = [K(α^{p^e}) : K] ≤ [K(α) : K].

Equality must hold throughout, and therefore K(α) = K(α^{p^e}). Since K(α^{p^e}) is
contained in K_s, α lies in K_s. Therefore every member of L lies in K_s, and L is
separable over K.


A function field in r variables over a field K is a field L that is finitely
generated over K and has transcendence degree r over K. A transcendence basis
{x_1, ..., x_r} of such an extension L/K is called a separating transcendence
basis of L/K if L is a separable algebraic extension of K(x_1, ..., x_r). If the
function field L in r variables over K has a separating transcendence basis, we
say that L is separably generated over K.

The two kinds of fields of continual interest in Chapter VI were number fields 
and function fields in one variable over a base field. In the latter case some results 
beginning in Section VI.6 assumed in effect that the function field is separably 
generated over the base field. It was asserted at the beginning of Section VI.9 that 
function fields in one variable over finite fields are always separably generated; 
this assertion is a special case of Theorem 7.20 below. 

Proposition 4.28 of Basic Algebra gave a version of the Factor Theorem valid 
for all commutative rings with identity. For the present investigation we need a 
version of the division algorithm that is valid in this wider context. 


Lemma 7.16. Let R be a commutative ring with identity, let f(X) and g(X)
be members of R[X] of respective degrees m and n, and let a be the leading
coefficient of g(X). For the integer k = max(m - n + 1, 0), there exist q(X) and
r(X) in R[X] such that

    a^k f(X) = g(X)q(X) + r(X)   with deg r < n or r = 0.


PROOF. If m < n, then k = 0, and the displayed formula holds with q(X) = 0
and r(X) = f(X). For m ≥ n - 1, we proceed by induction on m. The base
case of the induction is m = n - 1, which we have already handled. For the
inductive step, suppose that m ≥ n. The integer k is m - n + 1. If b is the leading
coefficient of f(X), then af(X) - bX^{m-n}g(X) is a polynomial that either is 0
or has degree less than m. The inductive hypothesis allows us to write

    a^{(m-1)-n+1}(af(X) - bX^{m-n}g(X)) = g(X)q_1(X) + r_1(X)

with deg r_1 < n or r_1 = 0. If we set q(X) = ba^{m-n}X^{m-n} + q_1(X) and
r(X) = r_1(X), then we obtain a^k f(X) = g(X)q(X) + r(X), and the lemma follows.
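
The induction in the proof can be unrolled into a loop that performs one leading-term cancellation for each degree from m down to n. The sketch below is a minimal illustration, assuming R = Z and polynomials stored as integer coefficient lists indexed by degree; the function names are hypothetical. It multiplies by a exactly k times, including steps where the current leading coefficient happens to be 0, so that the exponent matches the one in the lemma.

```python
def degree(p):
    """Degree of a coefficient list (index = degree); -1 for the zero polynomial."""
    d = len(p) - 1
    while d >= 0 and p[d] == 0:
        d -= 1
    return d

def pseudo_divide(f, g):
    """Return (k, q, r) with a**k * f == g*q + r and deg r < deg g or r == 0.

    Here a is the leading coefficient of g and k = max(deg f - deg g + 1, 0).
    Only ring operations are used, so no division by a ever takes place.
    """
    n = degree(g)
    if n < 0:
        raise ValueError("g must be nonzero")
    m = degree(f)
    a = g[n]
    k = max(m - n + 1, 0)
    q = [0] * max(k, 1)
    r = list(f)
    for d in range(m, n - 1, -1):
        c = r[d]                          # coefficient of X^d to be cancelled
        q = [a * coef for coef in q]
        q[d - n] += c
        r = [a * coef for coef in r]
        for i in range(n + 1):
            r[i + d - n] -= c * g[i]      # subtract c * X^(d-n) * g
    return k, q, r[:n]

# f = X^3 + X + 1 and g = 2X + 1 over Z, so a = 2 and k = 3.
k, q, r = pseudo_divide([1, 1, 0, 1], [1, 2])
print(k, q, r)
```

For this input the result is k = 3, q = 4X^2 - 2X + 5, and r = 3, and one checks directly that 8(X^3 + X + 1) = (2X + 1)(4X^2 - 2X + 5) + 3.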


Lemma 7.17. Let L/K be a field extension, let x_1, ..., x_n, x_{n+1} be elements
of L, and suppose that x_1, ..., x_n are algebraically independent over K but
that x_1, ..., x_n, x_{n+1} are not algebraically independent. Then the ideal I of all
polynomials in K[X_1, ..., X_{n+1}] that vanish at (x_1, ..., x_{n+1}) is principal with a
generator that is irreducible in K[X_1, ..., X_{n+1}] and involves X_{n+1} nontrivially.


420 VII. Infinite Field Extensions 


PROOF. The algebraic dependence implies that I contains nonzero polynomials.
Let g(X_1, ..., X_n, X_{n+1}) be one whose degree in X_{n+1} is as small as
possible, say l. Expand g as

g = c_0(X_1, ..., X_n) X_{n+1}^l + c_1(X_1, ..., X_n) X_{n+1}^{l-1} + ··· + c_l(X_1, ..., X_n).

The algebraic independence of x_1, ..., x_n implies that at least one of c_0, ..., c_{l-1}
is nonzero. Since K[X_1, ..., X_n] is a unique factorization domain, we can factor
out and discard the greatest common divisor of the coefficients c_0, ..., c_l. Thus
we may assume that g is primitive as a polynomial in X_{n+1}. If f is any element
in I, then Lemma 7.16 applied to the ring K[X_1, ..., X_n] allows us to write
a^k f = gq + r with r = 0 or deg r < l, where a = c_0 is the leading coefficient of g
and degrees are taken in X_{n+1}. Substituting (x_1, ..., x_{n+1}), we see that
r is in I. The minimality of l implies that r = 0, and thus a^k f = gq. Write c(h)
for the greatest common divisor of the coefficients of a polynomial h. Taking
the greatest common divisor of the coefficients on each side of a^k f = gq and
applying Gauss’s Lemma, we obtain a^k c(f) = c(q). Therefore a^k divides q,
and we obtain f = g q_0 for some q_0. Consequently I is principal. If g = g_1 g_2,
then the definition of I shows that at least one of g_1 and g_2 is in I, say g_1. The
minimality of l implies that the degree of g_1 in X_{n+1} is l. Therefore g_2 is in
K[X_1, ..., X_n]. Since g is primitive, g_2 divides 1. Hence g_2 lies in K.


Theorem 7.18 (Mac Lane). If L/K is a field extension that is finitely generated
and separably generated, then any set of generators contains a subset that is a
separating transcendence basis of L/K.


PROOF. Let the characteristic be p. The proof is by induction on the tran- 
scendence degree of the extension. For transcendence degree 0, the required set 
is the empty set, and there is nothing to prove. The main step is transcendence 
degree 1. 

Thus let L = K(x_1, ..., x_n), and suppose that {z} is a transcendence basis of
L over K such that L is separable over K(z). Since z is transcendental, z does not
lie in K(z^p). Thus Proposition 7.10 shows that X^p - z^p is irreducible over K(z^p),
and z is inseparable over K(z^p). The field L is algebraic over K(z^p), and the
subset of separable elements over K(z^p) is a subfield. Since L = K(x_1, ..., x_n)
and since z is a member of L that is not separable over K(z^p), it follows that some
x_j, say x_1, is inseparable over K(z^p). It will be proved that {x_1} is a separating
transcendence basis of L over K, i.e., that x_1 is transcendental over K and that L
is separable algebraic over K(x_1).

We apply Lemma 7.17 with n = 1 to the elements z, x_1. The lemma produces
an irreducible polynomial f(Z, X) in K[Z, X] such that f(z, x_1) = 0.
Gauss’s Lemma shows that this polynomial remains irreducible when considered
in K(Z)[X], and we have a ring isomorphism K(Z)[X] ≅ K(z)[X] because z is


3. Separable and Purely Inseparable Extensions 421 


transcendental over K. Up to a nonzero factor from K(z), f(z, X) is the minimal
polynomial of x_1 over K(z). Since L is separable over K(z), the element x_1 is
separable over K(z), and its minimal polynomial over K(z) involves some power
of X that is not a power of X^p.

Let us prove that x_1 is transcendental over K. In the contrary case, let g(X)
be its minimal polynomial over K. Since g vanishes when X = x_1 and Z = z,
g(X) satisfies an identity g(X) = q(Z, X) f(Z, X) in K[Z, X]. It therefore
satisfies the same identity in K(X)[Z]. Since g(X) is a unit in K(X)[Z], so is
f(Z, X). Therefore f(Z, X) is independent of Z. Since g(X) is the minimal
polynomial for x_1 over K, g(X) = c f(Z, X) for some c in K. Since f(Z, X)
involves a power of X that is not a power of X^p, the same thing is true of g(X),
and consequently x_1 is separable over K. Therefore x_1 is separable over the larger
field K(z^p), in contradiction to the defining condition on x_1. We conclude that
x_1 is transcendental over K.

Since L has transcendence degree 1 over K, it follows that z is algebraic over
K(x_1). Let us see that z is separable over K(x_1). In fact, Gauss’s Lemma shows
that f(Z, X) remains irreducible when considered in K(X)[Z], and we have a
ring isomorphism K(X)[Z] ≅ K(x_1)[Z] because x_1 is transcendental over K.
Therefore f(Z, x_1) is the product of a nonzero member of K(x_1) and the minimal
polynomial m(Z) of z over K(x_1). If z were inseparable over K(x_1), then m(Z)
would be a polynomial in Z^p, and we would have f(Z, X) = h(Z^p, X) with
h in K[Z, X]. We know that f(Z, X) involves some power of X that is not a
power of X^p, and hence the same thing is true of h(Z^p, X). Since h(z^p, X) is
irreducible in K(z^p)[X], x_1 is separable over K(z^p), in contradiction to the defining
property of x_1. Therefore z is separable over K(x_1).

The defining property of z is that all x_j are separable over K(z). Since z is
separable over K(x_1), all of x_2, ..., x_n are separable over K(x_1). Therefore L
is separable over K(x_1), and {x_1} is a separating transcendence basis of L/K. This
completes the proof of the theorem for transcendence degree 1.

The inductive step is somewhat a formal consequence of what has just been
proved. To see this, suppose that the theorem is known for transcendence degrees
1 and r - 1, and let L = K(x_1, ..., x_n) have transcendence degree r.
The assumption is that L has a transcendence basis {z_1, ..., z_r} such that L
is separable over K(z_1, ..., z_r). Put K_1 = K(z_1). Then the set {z_2, ..., z_r}
is a transcendence basis of L over K_1 consisting of r - 1 elements, and L is
separable over K_1(z_2, ..., z_r) = K(z_1, ..., z_r) by assumption. By the inductive
hypothesis for the case of transcendence degree r - 1, some subset of r - 1
elements from among x_1, ..., x_n forms a separating transcendence basis of L
over K_1; let us say that this basis is {x_1, ..., x_{r-1}}. This implies that L is
separable over K_1(x_1, ..., x_{r-1}) = K(z_1, x_1, ..., x_{r-1}). In other words, if
K' = K(x_1, ..., x_{r-1}), then L = K'(x_r, ..., x_n) is separable over K'(z_1). Since
L/K' has transcendence degree 1, {z_1} is a separating transcendence basis of L/K'.
By the inductive hypothesis for transcendence degree 1, some x_j for r ≤ j ≤ n
forms a separating transcendence basis of L/K'. For this j, {x_1, ..., x_{r-1}, x_j} is
then a separating transcendence basis of L/K.


Lemma 7.19. Suppose that L is a field extension of transcendence degree r
over a field K and that L is not separably generated over K. If x_1, ..., x_n are
elements of L such that L = K(x_1, ..., x_n), then for a suitable relabeling of the
x_j’s, the subfield K(x_1, ..., x_{r+1}) of L is of transcendence degree r and is not
separably generated over K.


PROOF. We fix K and r, and we proceed by induction on n. The base case is
that n = r + 1, and then there is nothing to prove. For the inductive step, suppose
that the lemma has been proved for n - 1 when n > r + 1. We prove the lemma
for n. Since r < n, we can renumber the x_j’s and assume that K(x_2, ..., x_n)
has transcendence degree r over K. If this field is not separably generated over
K, then we are in a situation with n - 1 elements. The inductive hypothesis is
applicable, and the lemma follows in this case.

Thus suppose that K(x_2, ..., x_n) is separably generated over K. Theorem 7.18
shows that after a renumbering of the indices, we may assume that {x_2, ..., x_{r+1}}
is a separating transcendence basis of K(x_2, ..., x_n) over K. This implies that
K(x_2, ..., x_n) is a separable extension of K(x_2, ..., x_{r+1}). Since by assumption
L = K(x_1, ..., x_n) is not separably generated over K, K(x_1, ..., x_n) is not
separable over K(x_2, ..., x_{r+1}). A separable extension of a separable extension is
separable, and we deduce that K(x_1, ..., x_n) is not separable over K(x_2, ..., x_n).
Thus x_1 is inseparable over K(x_2, ..., x_n) and is consequently inseparable over
the subfield K(x_2, ..., x_{r+1}). Hence K(x_1, ..., x_{r+1}) is not separably generated
over K.


Theorem 7.20 (F. K. Schmidt). If K is a perfect field, then every finitely 
generated field extension of K is separably generated over K. 


REMARK. In particular, the theorem applies if K is a finite field or is alge- 
braically closed or has characteristic 0. 


PROOF. Let K have characteristic p. We induct on the transcendence degree 
of the field extension of K. The base case of the induction is transcendence 
degree 0, and then the theorem is handled by Proposition 7.15. For the inductive 
step, assume that the theorem holds for all finitely generated field extensions of 
K having transcendence degree r - 1 over K. Let L = K(x_1, ..., x_n) have
transcendence degree r over K. Arguing by contradiction, suppose that L is not
separably generated over K. Lemma 7.19 shows for a suitable renumbering of the
x_j’s that K' = K(x_1, ..., x_{r+1}) has transcendence degree r and is not separably
generated over K.

We divide matters into two cases. First suppose that the transcendence degree
of K'' = K(x_1, ..., x_r) is r - 1. The inductive hypothesis shows that K'' is
separably generated over K, and then Theorem 7.18 shows that we may renumber
the variables in such a way that {x_1, ..., x_{r-1}} is a transcendence basis of K'' over
K and K'' is separable algebraic over K(x_1, ..., x_{r-1}). Then {x_1, ..., x_{r-1}, x_{r+1}}
is a transcendence basis of K', and x_r is algebraic over K(x_1, ..., x_{r-1}, x_{r+1}).
Since x_r is separable over K(x_1, ..., x_{r-1}), it is separable over the larger field
K(x_1, ..., x_{r-1}, x_{r+1}). Therefore K' is separably generated over K, contradiction.

The remaining case is that every subset of r members of {x_1, ..., x_{r+1}} is a
transcendence basis of K' over K. Lemma 7.17 produces an irreducible polynomial
f in K[X_1, ..., X_{r+1}] such that f(x_1, ..., x_{r+1}) = 0. Since {x_1, ..., x_r}
is a transcendence basis of K', application of Gauss’s Lemma shows that f is
irreducible in K(X_1, ..., X_r)[X_{r+1}] ≅ K(x_1, ..., x_r)[X_{r+1}]. Hence up to a
nonzero factor from K(x_1, ..., x_r), f(x_1, ..., x_r, X_{r+1}) is the minimal polynomial
of x_{r+1} over K(x_1, ..., x_r). The failure of K' to be separably generated over K implies
that x_{r+1} is inseparable over K(x_1, ..., x_r), and thus the only powers of X_{r+1} that
appear in its minimal polynomial over K(x_1, ..., x_r) are powers of X_{r+1}^p. In other
words, f is in K[X_1, ..., X_r, X_{r+1}^p]. Since we are assuming that any r of the
elements x_1, ..., x_{r+1} form a transcendence basis of K' over K, there is nothing
special about X_{r+1} in this argument. Consequently f is in K[X_1^p, ..., X_{r+1}^p].
Since K is perfect, any polynomial involving only p-th powers of each indeterminate
is the p-th power of some polynomial. Consequently f is reducible in
K[X_1, ..., X_{r+1}], in contradiction to the irreducibility guaranteed by Lemma
7.17. All cases thus lead to a contradiction, and the proof is complete.
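
The step that uses perfectness can be checked by hand when K is the prime field
F_p, where every element is its own p-th root. The following Python sketch treats
only one indeterminate and stores a polynomial as a list of coefficients modulo p
with the constant term first. These conventions and the function names are
assumptions made for illustration. The sketch extracts the p-th root of a
polynomial in X^p and confirms it by multiplying out.

```python
def poly_mul_mod(f, g, p):
    """Product of two coefficient lists (degree 0 first), reduced modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def poly_pow_mod(f, e, p):
    """e-th power of f modulo p by repeated multiplication (e >= 1)."""
    result = [1]
    for _ in range(e):
        result = poly_mul_mod(result, f, p)
    return result


def pth_root(h, p):
    """Given h in F_p[X] involving only powers of X**p, return g with g**p == h.

    Over F_p each coefficient c satisfies c**p == c, so the root is obtained
    by keeping every p-th coefficient of h.
    """
    if any(c % p for i, c in enumerate(h) if i % p):
        raise ValueError("h is not a polynomial in X**p")
    return [c % p for c in h[::p]]


# h(X) = 2 + X^3 + 2X^6 over F_3 is the cube of g(X) = 2 + X + 2X^2.
h = [2, 0, 0, 1, 0, 0, 2]
g = pth_root(h, 3)
```

In particular h(X) factors as g(X)^3 and so is reducible, which is the mechanism
behind the contradiction in the final case of the proof.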


4. Krull Dimension 


In this section we develop the algebraic background necessary for a discussion
of dimension. Suppose that K is an algebraically closed field, suppose that I is
a prime ideal in K[X_1, ..., X_n], and suppose that V(I) is the locus of common
zeros of I. Corollary 7.2 shows that I is the set of all polynomials vanishing on
V(I), and thus the integral domain R = K[X_1, ..., X_n]/I may be regarded as
the set of all restrictions to V(I) of polynomials. If L is the field of fractions
of R, then the transcendence degree of L/K is interpreted as the “number of
independent variables” on the locus V(I). We define it to be the dimension of
V(I). The elements X_j + I of R for 1 ≤ j ≤ n generate R as a K algebra,
and therefore they generate L over K as a field. We shall make critical use of
the fact implied by Lemma 7.6b that some subset of {X_1 + I, ..., X_n + I} is a
transcendence basis of L. We shall speak of such a subset as a transcendence
basis of R for economy of words. We denote its cardinality by tr. deg R.


EXAMPLE. We continue with the example from Sections 1-2. Let K = C,
let I be the principal ideal (Y^2 - X(X + 1)(X - 1)) in C[X, Y], and let L
be the field of fractions of the integral domain R = C[X, Y]/I. Corollary
7.2 shows that the ring R is the ring of restrictions of polynomials to the locus
V(I) = {(x, y) ∈ C^2 | y^2 = x(x + 1)(x - 1)}. According to the above definition,
the dimension of V(I) is the transcendence degree of L, which we have seen is 1.
This is in accord with the intuition that the locus V(I) is a “curve” in the sense
of having one independent complex parameter.


The goal of this section is to produce an equivalent definition of dimension
that does not depend on the fact that K[X_1, ..., X_n]/I is an integral domain.
The rephrased definition will extend to any commutative ring with identity and 
is essential for modern algebraic geometry. 

Let R be any commutative ring with identity. The Krull dimension of R,
denoted by dim R, is the supremum of the indices d of all strictly increasing
chains

P_0 ⊊ P_1 ⊊ ··· ⊊ P_d

of prime ideals in R. We define dim R = ∞ if there is no finite supremum.


EXAMPLES OF KRULL DIMENSION. 


(1) R equal to a field. The only prime ideal is 0. Thus the Krull dimension of 
any field is 0. 


(2) R = Z. The prime ideals are of the form pZ for each prime number p,
together with 0. Each nonzero prime ideal is maximal. Consequently there is a
strictly increasing chain 0 ⊊ pZ of prime ideals for each prime number p, but
there are no longer such chains. Thus dim Z = 1. More generally any principal
ideal domain R that is not a field, or even any Dedekind domain R that is not a
field, has dim R = 1 because every nonzero prime ideal is maximal.


(3) R commutative Artinian. In Chapter II a ring with identity was defined to be 
Artinian if its two-sided ideals satisfy the descending chain condition. Problem 8 
at the end of that chapter showed that every prime ideal in such a ring is maximal. 
In other words, every commutative Artinian ring has Krull dimension 0. 


(4) Polynomial ring R = K[X_1, ..., X_n], where K is a field. In geometric
terms for the case that K is algebraically closed, the relevant zero locus for this
R is K^n, which we certainly want to have dimension equal to n, and the field of
fractions of R is K(X_1, ..., X_n), which indeed has transcendence degree n. Let
us examine the Krull dimension of R. If 0 ≤ k ≤ n and if we form the ideal
(X_1, ..., X_k), then the ring isomorphism

R ≅ K[X_{k+1}, ..., X_n][X_1, ..., X_k]

shows that the quotient R/(X_1, ..., X_k) is isomorphic to K[X_{k+1}, ..., X_n],
which is an integral domain. Therefore (X_1, ..., X_k) is prime, and we have
a strictly increasing chain

0 ⊊ (X_1) ⊊ (X_1, X_2) ⊊ ··· ⊊ (X_1, ..., X_n).

So dim K[X_1, ..., X_n] ≥ n. Actually, equality holds, as Theorem 7.22 will
show.
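
Example 3 can be tested by brute force in a small case. The following Python
sketch, whose function names are hypothetical, lists the prime ideals of the finite
ring Z/nZ with n ≥ 2 directly from the definition and computes the length of the
longest strictly increasing chain among them. It relies on the standard fact that
every ideal of Z/nZ is generated by a divisor of n.

```python
def ideals(n):
    """Map each divisor d of n to the ideal dZ/nZ, stored as a frozenset of residues."""
    return {d: frozenset(range(0, n, d)) for d in range(1, n + 1) if n % d == 0}


def is_prime_ideal(P, n):
    """Check the definition: P is proper, and ab in P forces a in P or b in P."""
    if len(P) == n:
        return False
    return all(a in P or b in P
               for a in range(n) for b in range(n) if (a * b) % n in P)


def prime_ideal_generators(n):
    """Divisors d of n for which dZ/nZ is a prime ideal (d == n means the zero ideal)."""
    return sorted(d for d, P in ideals(n).items() if is_prime_ideal(P, n))


def krull_dimension(n):
    """Largest d with a chain P_0 < P_1 < ... < P_d of prime ideals of Z/nZ."""
    primes = [P for P in ideals(n).values() if is_prime_ideal(P, n)]

    def longest_from(P):
        above = [Q for Q in primes if P < Q]    # strict containment of frozensets
        return 1 + max(map(longest_from, above)) if above else 0

    return max(longest_from(P) for P in primes)
```

For n = 12 the prime ideals found are those generated by 2 and 3, neither contains
the other, and the computed dimension is 0, as Example 3 predicts for a commutative
Artinian ring.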


Lemma 7.21. Let R be a commutative ring with identity, let S^{-1}R be the
localization relative to a multiplicative system S in R, let I be an ideal in R, and
let S̄ be the image of S in R/I. Then

S^{-1}R / S^{-1}I ≅ S̄^{-1}(R/I)

via the mapping s^{-1}r + S^{-1}I ↦ (s + I)^{-1}(r + I).


PROOF. Let q : R → R/I and q̃ : S^{-1}R → S^{-1}R/S^{-1}I be the quotient
homomorphisms, and let η : R → S^{-1}R and η̄ : R/I → S̄^{-1}(R/I) be the
canonical homomorphisms of R and R/I into their localizations. To each of the
rings X_1 = S^{-1}R/S^{-1}I and X_2 = S̄^{-1}(R/I) is associated a canonical map,
namely η_1 : R → X_1 and η_2 : R → X_2 with η_1 = q̃η and η_2 = η̄q. Let
us see that the pairs (X_i, η_i) for i = 1, 2 have the following universal mapping
property with respect to ring homomorphisms φ of R into a commutative ring
T with identity such that φ(1) = 1, φ(I) = 0, and φ(S) ⊆ T^×: there exists a
unique homomorphism φ_i : X_i → T such that φ = φ_i η_i.

For i = 1, we first apply the universal mapping property of the localization
S^{-1}R to write φ = φ̃η and then apply the universal mapping property of the
quotient to write φ = φ_1 q̃η. For i = 2, we first apply the universal mapping
property of the quotient R/I to write φ = φ̄q and then apply the universal mapping
property of the localization to write φ = φ_2 η̄q. From these constructions we
deduce existence and uniqueness of φ_i in both cases. The asserted isomorphism
then follows from the general fact that objects satisfying a universal mapping
property are unique up to isomorphism; tracking down that isomorphism gives
the explicit formula in the lemma.



Theorem 7.22. Let K be a field, let R be an integral domain that is finitely 
generated as a K algebra, and let L be the field of fractions of R. Then the Krull 
dimension of R equals the transcendence degree of L over K. 



PROOF. If x_1, ..., x_n are generators of R as a K algebra, then R ≅
K[X_1, ..., X_n]/I, where I is the ideal of all polynomials in K[X_1, ..., X_n]
that vanish at (x_1, ..., x_n). The ideal I is prime, since R is assumed to be an
integral domain. Let r be the transcendence degree of L over K. We know from
Lemma 7.6b that some subset of {x_1, ..., x_n} is a transcendence basis of L over
K; therefore r ≤ n. To prove the theorem, we shall prove that r ≥ dim R and
that r ≤ dim R.

Suppose that P and Q are prime ideals of R with P ⊆ Q. Then the identity
map on R descends to a K algebra homomorphism φ : R/P → R/Q. If
α_j = x_j + P and β_j = x_j + Q are the images of x_j under the respective quotient
maps R → R/P and R → R/Q, then {α_1, ..., α_n} is a set of generators of R/P,
{β_1, ..., β_n} is a set of generators of R/Q, and φ(α_j) = β_j for 1 ≤ j ≤ n. If r' =
tr. deg R/Q, we may assume that {β_1, ..., β_{r'}} is a transcendence basis of R/Q.
Then {α_1, ..., α_{r'}} is an algebraically independent subset of R/P over K because
if f is a nonzero polynomial in K[X_1, ..., X_{r'}] such that f(α_1, ..., α_{r'}) = 0,
then application of φ and use of the fact that φ fixes each coefficient of f yields
f(β_1, ..., β_{r'}) = 0; the latter equation contradicts the algebraic independence of
{β_1, ..., β_{r'}}. We conclude that

P ⊆ Q implies tr. deg(R/P) ≥ tr. deg(R/Q). (*)
To prove the inequality r ≥ dim R, let a chain of prime ideals

0 ⊊ P_1 ⊊ P_2 ⊊ ··· ⊊ P_d

of R be given. We are to show that r ≥ d. Abbreviate K[X_1, ..., X_n] as A, so
that R = A/I. Pull the chain of ideals of R back to a chain of ideals in A as

P'_0 ⊊ P'_1 ⊊ ··· ⊊ P'_d. (**)

Inequality (*) shows that

tr. deg(A/P'_0) ≥ tr. deg(A/P'_1) ≥ ··· ≥ tr. deg(A/P'_d). (†)

Since taking P'_0 = I shows that tr. deg(A/I) = tr. deg(R) = r, every member of
(†) is ≤ r. It will follow from (†) that r ≥ d if we show that each inequality in
(†) is strict, i.e., that for prime ideals P and Q in A,

P ⊊ Q implies tr. deg(A/P) > tr. deg(A/Q). (††)

Since dim R is the supremum of the integers d as in (**) and (†), proving (††)
will prove that r ≥ dim R.

Thus let P and Q be prime ideals in A = K[X_1, ..., X_n] with P ⊊ Q. Put
α_j = X_j + P and β_j = X_j + Q, so that the mappings of A to A/P and A/Q
are f(X_1, ..., X_n) ↦ f(α_1, ..., α_n) and f(X_1, ..., X_n) ↦ f(β_1, ..., β_n).
Then A/P = K[α_1, ..., α_n] and A/Q = K[β_1, ..., β_n]. As above, if r' =
tr. deg A/Q, then we may assume that {β_1, ..., β_{r'}} is a transcendence basis of
A/Q. Arguing by contradiction, we may assume that tr. deg A/P = tr. deg A/Q.
Then it follows that {α_1, ..., α_{r'}} is a transcendence basis of A/P. We localize A
with respect to the multiplicative system S consisting of the complement of 0 in
K[X_1, ..., X_{r'}]. Then S^{-1}A = K(X_1, ..., X_{r'})[X_{r'+1}, ..., X_n]. To understand
S^{-1}P, we apply Lemma 7.21 to write

S^{-1}A / S^{-1}P ≅ S̄^{-1}(A/P), (‡)

where S̄ is the image of S in A/P. The restriction to K[X_1, ..., X_{r'}] of the
map A → A/P carries f(X_1, ..., X_{r'}) to f(α_1, ..., α_{r'}) and is one-one because
{α_1, ..., α_{r'}} is a transcendence set. Therefore S ∩ P = ∅, and S → S̄ is
one-one. Corollary 8.48d of Basic Algebra shows from S ∩ P = ∅ that S^{-1}P
is a proper ideal of S^{-1}A. Since S → S̄ is one-one, let us view S̄ as S̄ =
{f(α_1, ..., α_{r'}) | f ≠ 0}. Then

S̄^{-1}(A/P) = K(α_1, ..., α_{r'})[α_{r'+1}, ..., α_n]. (‡‡)

Since α_{r'+1}, ..., α_n are algebraic over K(α_1, ..., α_{r'}) because of the assumption
tr. deg A/P = tr. deg A/Q = r', the remark with Lemma 7.3 shows that (‡‡) is
a field. By (‡), S^{-1}P is a maximal ideal. Arguing similarly with Q, we see that
S ∩ Q = ∅ and that S^{-1}Q is a maximal ideal. From P ⊆ Q, we have S^{-1}P ⊆
S^{-1}Q. Because S^{-1}P and S^{-1}Q are maximal, S^{-1}P = S^{-1}Q. Therefore Q ⊆
S^{-1}P. Since Q properly contains P, we can choose g in Q that is not in P. This g
is an element of K[X_1, ..., X_n] such that g(α_1, ..., α_n) ≠ 0 and g(β_1, ..., β_n) =
0. From the inclusion Q ⊆ S^{-1}P, there exist an f in P and a nonzero s in
K[X_1, ..., X_{r'}] with g = s^{-1}f. Then f = sg. Since f(α_1, ..., α_n) = 0 and
s(α_1, ..., α_{r'}) g(α_1, ..., α_n) ≠ 0, we obtain a contradiction. This contradiction
proves (††) and shows that r ≥ dim R.

The argument that r ≤ dim R will proceed by induction on r. If r = 0, then
R = K[x_1, ..., x_n] is a field by the remark with Lemma 7.3, and dim R = 0 by
Example 1 of Krull dimension. Now suppose inductively that r > 0 and that the
inequality is known when tr. deg R < r. Put A = K[X_1, ..., X_n], and suppose
that R = A/I = K[x_1, ..., x_n] with x_1 transcendental over K. We localize A
with respect to the multiplicative system S consisting of the complement of 0
in K[X_1]. Then S^{-1}A = K(X_1)[X_2, ..., X_n]. To understand S^{-1}I, we apply
Lemma 7.21 to write

S^{-1}A / S^{-1}I ≅ S̄^{-1}(A/I),

where S̄ is the image of S in A/I. Arguing as in the previous paragraph, we see
that

S̄^{-1}(A/I) ≅ K(x_1)[x_2, ..., x_n].

Combining these two isomorphisms, we see that S^{-1}A/S^{-1}I has transcendence
degree r - 1 over K(x_1). By the inductive hypothesis, S^{-1}A/S^{-1}I has Krull
dimension ≥ r - 1. Thus there exists a strictly increasing chain

S^{-1}I = Q_0 ⊊ Q_1 ⊊ ··· ⊊ Q_{r-1}

of prime ideals in S^{-1}A. If we put P_i = A ∩ Q_i for each i, then each P_i is prime
in A. From the theory of localization, we know that Q_i is recovered from P_i by
Q_i = S^{-1}P_i, and thus we have a strictly increasing chain

P_0 ⊊ P_1 ⊊ ··· ⊊ P_{r-1} (§)

of prime ideals in A. The fact that Q_{r-1} = S^{-1}P_{r-1} is proper implies that
S ∩ P_{r-1} = ∅. That is, no nonzero member of K[X_1] lies in P_{r-1}. Consequently
the image of X_1 in A/P_{r-1} is transcendental over K. The Nullstellensatz
(Theorem 7.1) shows that P_{r-1} is not maximal in A. Hence the chain (§) can be
extended by a strict inclusion in a maximal ideal P_r, and r ≤ dim A/I = dim R.
This completes the induction and the proof.


5. Nonsingular and Singular Points 


In this section we develop the initial algebraic background necessary for a dis- 
cussion of nonsingular and singular points. Unlike what happened in previous 
sections, we shall not try to separate completely the algebra from the geometric 
setting, because the points to be investigated are the actual points of a zero locus. 

The motivation comes from the Implicit Function Theorem in the calculus of
several variables. In that setting, suppose that we have l numerical-valued smooth
functions f_1, ..., f_l of n variables. Let k be an integer with 1 ≤ k < n, and
abbreviate (x_1, ..., x_n) as (x, y), where x = (x_1, ..., x_k) and y = (x_{k+1}, ..., x_n).
Suppose that (x_0, y_0) has the property that f_i(x_0, y_0) = 0 for 1 ≤ i ≤ l. The hope
is that there is a smooth vector-valued function y = g(x) defined near x = x_0
such that y_0 = g(x_0) and such that f_i(x, y) = 0 for 1 ≤ i ≤ l with (x, y) near
(x_0, y_0) if and only if y = g(x), i.e., that the locus of common zeros of f_1, ..., f_l
is locally the graph of a smooth function of k variables. According to the Implicit
Function Theorem, a sufficient condition for this to happen is that k + l = n and
that the (square) matrix of the first partial derivatives at (x_0, y_0) of the f_i’s for
1 ≤ i ≤ l with respect to the y_j’s for k + 1 ≤ j ≤ n be invertible. A little more
generally but still with k + l = n, the locus of common zeros is locally the graph
of a smooth function of l of the variables in terms of the remaining k variables if
the matrix of all the first partial derivatives of the f_i’s has the maximum possible
rank, namely l.

Let us describe the setting for a comparable situation in algebraic geometry.
Throughout this section we assume that K is an algebraically closed field.
Suppose that I is a prime ideal in K[X_1, ..., X_n], and let V(I) be the locus of
common zeros⁵ of I in K^n. The Hilbert Basis Theorem shows that I is finitely
generated over K as an ideal, and we let {f_1, ..., f_l} be a set of generators.
Corollary 7.2 shows that I is the set of all polynomials vanishing on V(I), and
thus the integral domain R = K[X_1, ..., X_n]/I may be regarded as the set of all
restrictions to V(I) of polynomials in the following sense: if x = (x_1, ..., x_n) is
a member of V(I) and f(X_1, ..., X_n) is in K[X_1, ..., X_n], then every member
of the coset f + I has the same value at x, and it is consequently meaningful to
write f(x) for f in R.

From Theorem 7.22 the transcendence degree over K of the field of fractions
of R equals the Krull dimension of the ring R, and these numbers are what is
taken as the dimension of V(I) over K. We write dim V(I) for this dimension.
In this setting, a point x of V(I) is called a nonsingular point, or regular point,
if the matrix [∂f_i/∂X_j (x)] has rank equal to n - dim V(I). Otherwise x is a singular
point.

It is important to observe that these definitions do not depend on the choice of
the set {f_1, ..., f_l} of generators of I. In fact, it is enough to show that the row
space of the matrix [∂f_i/∂X_j (x)] is exactly the space of all row vectors

( ∂f/∂X_1 (x)  ···  ∂f/∂X_n (x) )  for f ∈ I,

since the latter space is manifestly independent of the choice of generators. To see
that the displayed space equals the row space of the matrix whose rank appears
in the definition of singular point, let g_1, ..., g_l be arbitrary polynomials. Then
f = Σ_i g_i f_i is the most general member of I. Use of the product rule and the
fact that f_i(x) = 0 for each i shows that ∂f/∂X_j (x) = Σ_i g_i(x) ∂f_i/∂X_j (x). Since the
g_i are arbitrary, we can arrange for (g_1(x), ..., g_l(x)) to be any given member
of K^l. Thus the space of all row vectors ( ∂f/∂X_1 (x) ··· ∂f/∂X_n (x) ) for f ∈ I is
the set of all K linear combinations of row vectors ( ∂f_i/∂X_1 (x) ··· ∂f_i/∂X_n (x) ) for
1 ≤ i ≤ l, as asserted.


⁵ In terminology to be used in later chapters, one says that V(I) is the affine variety
corresponding to I.


EXAMPLES. 


(1) Irreducible affine curve⁶ in K^2. Suppose that n = 2 in the notation
above and that I is nonzero and is generated by a single nonconstant polynomial
f(X, Y). The condition that I be prime is exactly the condition that f(X, Y)
be a prime polynomial. In turn, since K[X, Y] is a unique factorization domain,
the condition that f(X, Y) be prime is exactly the condition that f(X, Y) be
irreducible. Let us specialize to a case for which the first partial derivatives take
an especially simple form: suppose that

f(X, Y) = Y^2 - h(X).

The only possible factorization is f(X, Y) = (Y + √h(X))(Y - √h(X)), and
thus f(X, Y) is irreducible in K[X, Y] if h(X) is not the square of a member
of K[X]. The relevant integral domain is R = K[X, Y]/(f(X, Y)), and we let
x = X + (f(X, Y)) and y = Y + (f(X, Y)). Then x is transcendental over
K, and the equation y^2 = h(x) shows that y is algebraic over K(x). Hence
tr. deg R = 1, and the corresponding V(I) has dim V(I) = 1. If (x_0, y_0) is a
point of V(I), then the matrix of first partial derivatives is

( ∂f/∂X  ∂f/∂Y ) |_{(x_0, y_0)} = ( -h'(X)  2Y ) |_{(x_0, y_0)} = ( -h'(x_0)  2y_0 ).

The rank of this matrix is ≤ 1, and nonsingularity of (x_0, y_0) means that the
matrix has rank equal to 1. If the characteristic is ≠ 2, then the condition for a
singularity is that y_0^2 = h(x_0), y_0 = 0, and h'(x_0) = 0 simultaneously. Hence
V(I) is everywhere nonsingular⁷ if and only if h has no multiple roots in K.


(2) Irreducible affine hypersurface⁸ in K^n. For general n, again suppose that
I is a prime ideal generated by a single nonconstant polynomial f(X_1, ..., X_n).
The condition on f for I to be prime is that f be irreducible in K[X_1, ..., X_n].
The relevant ring is R = K[X_1, ..., X_n]/(f(X_1, ..., X_n)), and the image in R
of a polynomial g(X_1, ..., X_n) is 0 only if g is divisible by f, by Corollary
7.2. The polynomial f is nonconstant in some X_j, say for j = n. Then
no nonzero polynomial g(X_1, ..., X_{n-1}) maps to 0 in R. Consequently the
elements x_j = X_j + (f(X_1, ..., X_n)) have the property that {x_1, ..., x_{n-1}} is
a transcendence set in R. The equation f(x_1, ..., x_n) = 0 shows that x_n is
algebraic over K(x_1, ..., x_{n-1}). Hence the corresponding V(I) has dim V(I) =
tr. deg R = n - 1. The nonsingular points of V(I) are the points of V(I) for
which some first partial derivative of f is nonzero.
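
The criterion of Example 1 is easy to test at individual points. The following
Python sketch works in characteristic 0 and only with points having rational
coordinates, which are in particular points over the algebraically closed field C.
The coefficient-list representation of h (constant term first) and the function
names are assumptions made for illustration.

```python
from fractions import Fraction


def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients from degree 0 upward) by Horner's rule."""
    value = Fraction(0)
    for c in reversed(coeffs):
        value = value * x + c
    return value


def derivative(coeffs):
    """Coefficient list of the formal derivative."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def jacobian_row(h, x0, y0):
    """Row (df/dX, df/dY) of f(X, Y) = Y^2 - h(X), evaluated at (x0, y0)."""
    return (-evaluate(derivative(h), x0), 2 * Fraction(y0))


def is_singular(h, x0, y0):
    """True when (x0, y0) lies on Y^2 = h(X) and the Jacobian row has rank 0."""
    x0, y0 = Fraction(x0), Fraction(y0)
    if y0 * y0 != evaluate(h, x0):
        raise ValueError("point is not on the curve")
    return all(entry == 0 for entry in jacobian_row(h, x0, y0))
```

With h(X) = X^3 - X, which has no multiple roots, the point (0, 0) is nonsingular.
With h(X) = X^3 + X^2, which has a double root at 0, the point (0, 0) is singular
while the point (3, 6) is not.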


⁶ Some authors include irreducibility in the definition of “affine curve.” This book does not.

⁷ If K has characteristic 2 and if x_0 has the property that h'(x_0) = 0, then we can choose y_0 with
y_0^2 = h(x_0) because K is algebraically closed, and (x_0, y_0) will be a singular point. Hence V(I) is
everywhere nonsingular if and only if h' has no roots in K, i.e., if and only if h' is a nonzero constant.

⁸ Some authors include irreducibility in the definition of “affine hypersurface.” This book does
not.


Theorem 7.23 (Zariski’s Theorem). With K algebraically closed, let I be a
prime ideal in K[X_1, ..., X_n], let R = K[X_1, ..., X_n]/I, and let V(I) be the
locus of common zeros of I in K^n. If x = (x_1, ..., x_n) is a point of V(I), define
m_x to be the maximal ideal

m_x = {f ∈ R | f(x) = 0}

of R, let R_x be the localization of R with respect to m_x, and let M_x be the maximal
ideal of R_x. Then

dim_K (M_x/M_x^2) = dim_K (m_x/m_x^2) ≥ dim V(I),

and x is nonsingular if and only if equality holds. The set of nonsingular points
of V(I) is nonempty.


REMARKS. We are going to prove for each point x of V(I) that

dim_K (M_x/M_x^2) = dim_K (m_x/m_x^2)

and that

dim_K (m_x/m_x^2) + rank [∂f_i/∂X_j (x)] = n,

where {f_i} is a finite set of generators of I. Since by definition x is nonsingular if
and only if rank [∂f_i/∂X_j (x)] = n - dim V(I), it will follow that x is a nonsingular point
if and only if dim_K (m_x/m_x^2) = dim V(I). Only for the special case that V(I) is an
irreducible affine hypersurface do we prove that the inequality dim_K (m_x/m_x^2) ≥
dim V(I) always holds for all x and that equality always holds for some x. The
general case will ultimately be reduced to the special case; we return to this matter
in Chapter X. The partial proof that we give in the present section will be preceded
by an example.


EXAMPLE 1, CONTINUED. Suppose that an affine variety V in K^2 is obtained
from the irreducible polynomial f(X, Y) = Y^2 - h(X). Let us assume that K
has characteristic ≠ 2 and that (0, 0) lies in V. The latter condition means that
h(0) = 0. Let x = X + (f(X, Y)) and y = Y + (f(X, Y)). Since y^2 = h(x),
any polynomial in (x, y) can be rewritten in such a way that the only powers of
y that occur are 0 and 1. Thus R = {p(x) + y q(x) | p ∈ K[x], q ∈ K[x]}, and

m_{(0,0)} = {x p(x) + y q(x) | p ∈ K[x], q ∈ K[x]}.

The ideal m_{(0,0)}^2 consists of all sums of products of two elements of this kind.
From two polynomials x p(x), we can get any polynomial x^2 a(x); from x p(x)
and y q(x), we can get any x y b(x); and from two polynomials y q(x), we can get
any y^2 c(x) = h(x) c(x). Thus

m_{(0,0)}^2 = {x^2 a(x) + h(x) c(x) + y x b(x)}.

What happens depends on the first-degree term in h(x). Examining the possibilities,
we see that

m_{(0,0)}^2 = {x a(x) + y x b(x)}    if h'(0) ≠ 0,
m_{(0,0)}^2 = {x^2 a(x) + y x b(x)}  if h'(0) = 0.

Hence

m_{(0,0)}/m_{(0,0)}^2 ≅ K y        if h'(0) ≠ 0,
m_{(0,0)}/m_{(0,0)}^2 ≅ K x + K y  if h'(0) = 0.

In other words, dim_K m_{(0,0)}/m_{(0,0)}^2 equals 1 if (0, 0) is nonsingular and equals 2 if
(0, 0) is singular. Since dim V(I) = 1, this result is consistent with the statement
of Theorem 7.23.

PARTIAL PROOF OF THEOREM 7.23. As mentioned in the remarks, one thing
that we are going to prove for each point x of V(I) is that

dim_K (m_x/m_x^2) + rank [∂f_i/∂X_j (x)] = n, (*)

where {f_1, ..., f_l} is a finite set of generators of I.
Let I_x be the pullback to K[X_1, ..., X_n] of the ideal m_x, i.e., let

I_x = {f | f + I ∈ m_x} = {f ∈ K[X_1, ..., X_n] | f(x) = 0}.

The K linear mapping f ↦ f + I carries I_x onto m_x; composing with the
quotient mapping m_x → m_x/m_x^2 gives a K linear mapping φ of I_x onto m_x/m_x^2.
If f maps under φ to the 0 coset, then f + I = Σ_j (g_j + I)(h_j + I) for suitable
polynomials g_j and h_j with g_j + I and h_j + I in m_x. Then f - Σ_j g_j h_j lies in I,
and f is exhibited as a member of I_x^2 + I. Conversely φ does carry I_x^2 and I to
the 0 coset. Thus the kernel of φ is exactly I_x^2 + I, and φ descends to a K linear
isomorphism I_x/(I_x^2 + I) ≅ m_x/m_x^2. Therefore

dim_K (I_x/(I_x^2 + I)) = dim_K (m_x/m_x^2). (**)


We define a K linear map $\theta$ of $K[X_1, \dots, X_n]$ to the space $M_{1n}(K)$ of all
n-dimensional row vectors over K by

$$\theta(f) = \left( \frac{\partial f}{\partial X_1}(x) \ \cdots \ \frac{\partial f}{\partial X_n}(x) \right).$$

The product rule for differentiation shows that $\theta(I_x^2) = 0$. The ideal $I_x$, considered
as a K vector space, is spanned by $I_x^2$ and the various polynomials $X_j - x_j$. Since
$\theta(X_j - x_j)$ is the $j$th standard basis vector of $M_{1n}(K)$, the vectors $\theta(X_j - x_j)$
form a basis of $M_{1n}(K)$. Therefore $\theta$ descends to a K linear isomorphism
$\bar\theta : I_x/I_x^2 \to M_{1n}(K)$.

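We observed just before Examples 1 and 2 that the vector space of all row
vectors $\theta(f)$ for $f \in I$ equals the row space for the matrix $\left[\frac{\partial f_i}{\partial X_j}(x)\right]$. Hence

$$\dim_K \theta(I) = \operatorname{rank}\left[\frac{\partial f_i}{\partial X_j}(x)\right].$$

Since $\theta(I) = \bar\theta\bigl((I + I_x^2)/I_x^2\bigr)$ and since $\bar\theta$ is one-one, this equality shows that

$$\dim_K\bigl((I + I_x^2)/I_x^2\bigr) = \operatorname{rank}\left[\frac{\partial f_i}{\partial X_j}(x)\right]. \qquad (\dagger)$$

Adding $(**)$ and $(\dagger)$ gives

$$\dim_K(I_x/I_x^2) = \dim_K(\mathfrak{m}_x/\mathfrak{m}_x^2) + \operatorname{rank}\left[\frac{\partial f_i}{\partial X_j}(x)\right].$$

Since, as we have seen, $I_x/I_x^2$ is isomorphic to $M_{1n}(K)$ via $\bar\theta$, the left side is n,
and $(*)$ is proved.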
The second thing that we are going to prove now is that

$$\dim_K(\mathfrak{m}_x/\mathfrak{m}_x^2) = \dim_K(M_x/M_x^2). \qquad (\dagger\dagger)$$

If L is the field of fractions of the integral domain R, then the localization $R_x$ is the
subring of L of all quotients g/h with g and h in R and $h(x) \neq 0$. The inclusion
$\mathfrak{m}_x \subseteq M_x$ induces a K linear ring homomorphism $\varphi : \mathfrak{m}_x/\mathfrak{m}_x^2 \to M_x/M_x^2$, and
$(\dagger\dagger)$ will follow if $\varphi$ is shown to be one-one onto.


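If g/h is given in $M_x$ with $g \in \mathfrak{m}_x$ and with $h \in R$ having $h(x) \neq 0$, then the
decomposition

$$\frac{g}{h} = h(x)^{-1} g + \frac{g}{h}\, h(x)^{-1}\bigl(h(x) - h\bigr)$$

exhibits $h(x)^{-1}g$ in $\mathfrak{m}_x$ as mapping to $g/h + M_x^2$. Therefore $\varphi$ is onto.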


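If g in $\mathfrak{m}_x$ maps to $\sum_i \bigl(\frac{g_i}{h_i}\bigr)\bigl(\frac{g_i'}{h_i'}\bigr)$ in $M_x^2$, then we can clear fractions and write
$hg = \sum_i g_i g_i' h_i''$ for an element h of R with $h(x) \neq 0$. Here $\sum_i g_i g_i' h_i''$ is in $\mathfrak{m}_x^2$.
The set of elements f in R such that fg is in $\mathfrak{m}_x^2$ is an ideal in R that contains $\mathfrak{m}_x$
and that contains h. Since h is not in $\mathfrak{m}_x$ and since $\mathfrak{m}_x$ is maximal, this ideal in R
contains f = 1, and it follows that g is in $\mathfrak{m}_x^2$. Consequently $\varphi$ is an isomorphism,
and $(\dagger\dagger)$ is proved.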




PROOF OF REMAINDER OF THEOREM 7.23 FOR IRREDUCIBLE AFFINE HYPERSURFACES.
Let I be the principal ideal $(f(X_1, \dots, X_n))$, where f is irreducible.
We saw in Example 2 above that dim V(I) = n − 1. The matrix that appears
in $(*)$ has only one row, corresponding to f, and hence it has rank 1 or rank 0.
Substituting this fact into $(*)$, we see that $\dim_K(\mathfrak{m}_x/\mathfrak{m}_x^2) \ge n - 1 = \dim V(I)$.

Arguing by contradiction, suppose that strict inequality holds for every x in
V(I). Then $\frac{\partial f}{\partial X_j}(x) = 0$ for all $x \in V(I)$ and for all j. By Corollary 7.2, each
$\frac{\partial f}{\partial X_j}$ is the product of f and a polynomial. Since the degree of $\frac{\partial f}{\partial X_j}$ in $X_j$ is less
than the degree of f in $X_j$, it follows that $\frac{\partial f}{\partial X_j} = 0$ for all j. In characteristic 0,
this condition forces f to be constant and contradicts the assumption that f is
an irreducible polynomial (and in particular the assumption that f is not a unit).
In characteristic p, this condition forces each power of each $X_j$ that occurs in
f to be a multiple of p. That is, it says that $f(X_1, \dots, X_n) = g(X_1^p, \dots, X_n^p)$.
Let $\mathrm{Fr} : K \to K$ be the field map given by $a \mapsto a^p$. This is onto K, since
K is algebraically closed. Hence there exists a polynomial $h(X_1, \dots, X_n)$ such
that $h^{\mathrm{Fr}} = g$, where $h^{\mathrm{Fr}}$ denotes the result of applying Fr to each coefficient of h.
Then $f(X_1, \dots, X_n) = g(X_1^p, \dots, X_n^p) = h^{\mathrm{Fr}}(X_1^p, \dots, X_n^p) = h(X_1, \dots, X_n)^p$
exhibits f as reducible, contradiction. Hence strict inequality cannot hold for all
$x \in V(I)$, and some point of V(I) is nonsingular.
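To make the characteristic p step concrete, here is a small worked instance in two variables. It is an illustration chosen for this purpose, not an example from the text; the coefficients b, c and their p-th roots β, γ are hypothetical names.

```latex
% K algebraically closed of characteristic p; b, c in K arbitrary.
f(X_1, X_2) = X_1^p + b X_2^p + c = g(X_1^p, X_2^p),
\qquad g(Y_1, Y_2) = Y_1 + b Y_2 + c.
% Both first partial derivatives vanish identically because p = 0 in K:
\frac{\partial f}{\partial X_1} = p X_1^{p-1} = 0,
\qquad
\frac{\partial f}{\partial X_2} = p b X_2^{p-1} = 0.
% Fr is onto, so choose \beta, \gamma in K with \beta^p = b and \gamma^p = c,
% and put h(X_1, X_2) = X_1 + \beta X_2 + \gamma. Then
f(X_1, X_2) = (X_1 + \beta X_2 + \gamma)^p = h(X_1, X_2)^p,
% so f is a proper power and hence reducible.
```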


6. Infinite Galois Groups 


In this section, K denotes a field, and $K_{\mathrm{alg}}$ denotes a fixed algebraic closure of
K. We define $K_{\mathrm{sep}}$ to be the subfield of all elements of $K_{\mathrm{alg}}$ that are separable
over K. The field $K_{\mathrm{sep}}$ is called a separable algebraic closure of K. Theorem
7.13 shows that $K_{\mathrm{alg}}$ is a purely inseparable extension of $K_{\mathrm{sep}}$. If $F_1$ and $F_2$ are
any fields with $F_1 \subseteq F_2$, then the group of all field automorphisms of $F_2$ fixing
$F_1$ is denoted by $\mathrm{Gal}(F_2/F_1)$ and is called the Galois group of $F_2$ over $F_1$.

The purpose of this section is to extend the theory of Galois groups to handle
infinite extensions. Such an extended theory has at least two important applications
in the current context. A first application is to developments in algebraic
number theory beyond what appears in Chapters V and VI. For example one way
of viewing traditional class field theory for a number field F is that one forms
$\mathrm{Gal}(F_{\mathrm{alg}}/F)$, defines the maximal abelian extension $F_{\mathrm{ab}}$ of F to be the fixed
field of the closure of the commutator subgroup of $\mathrm{Gal}(F_{\mathrm{alg}}/F)$, and asks for a
description of $F_{\mathrm{ab}}$ in terms of F. A second application is to the study of varieties
over fields that are not algebraically closed. If a field K is given and a prime ideal
I in $K_{\mathrm{alg}}[X_1, \dots, X_n]$ is specified by giving a finite set of generators, we can ask
whether the same ideal can be defined via generators that lie in K. The given
generators have coefficients in $K_{\mathrm{alg}}$, and it is usually not obvious whether they
can be adjusted to have coefficients in K. However, if Galois theory is available,
then the question becomes whether the operation of each element of the Galois
group $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ carries each generator into a member of the ideal,⁹ and this
question is decidable by methods to be discussed in Chapter VII. More generally
algebraic geometry from before 1960 frequently worked with a field K and an
algebraically closed field L that is larger than $K_{\mathrm{alg}}$, for example with $K = \mathbb{Q}$ and
$L = \mathbb{C}$. Under the assumption that K is perfect and L is algebraically closed,
Theorem 7.34 below shows that Gal(L/K) fixes only the elements of K, and thus
Galois theory can still be used to decide in this situation whether a prime ideal in
$L[X_1, \dots, X_n]$ is generated by members of $K[X_1, \dots, X_n]$.

The definition of “normal field extension” in Basic Algebra was limited to finite
algebraic extensions, and the extensions were often assumed to be separable. We
now drop both the finiteness assumption and the separability assumption: A field
L with $K \subseteq L \subseteq K_{\mathrm{alg}}$ is said to be a normal extension of K if there exists
some nonempty family $\{f_i\}_{i \in S}$ of nonconstant polynomials in K[X] such that
L is generated by K and all the roots in $K_{\mathrm{alg}}$ of all the polynomials $f_i$. More
specifically all the polynomials $f_i$ split in $K_{\mathrm{alg}}$, say as $f_i(X) = c_i \prod_{j=1}^{n_i} (X - a_{ij})$,
and L is to be the subfield of $K_{\mathrm{alg}}$ generated by K and all the roots $a_{ij}$.


Proposition 7.24. The following conditions on a field L with $K \subseteq L \subseteq K_{\mathrm{alg}}$
are equivalent:

(a) L is a normal extension of K,

(b) $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ carries L to itself,

(c) any K isomorphism of L into $K_{\mathrm{alg}}$ carries L to itself,

(d) any polynomial f in K[X] that is irreducible over K and has one root in
L necessarily splits in L.


PROOF. If (a) holds, let L be generated by K and elements $a_{ij}$ as in the
paragraph before the proposition. If $\varphi$ is in $\mathrm{Gal}(K_{\mathrm{alg}}/K)$, then $\varphi(a_{ij})$ is a root of
$f_i^{\varphi} = f_i$ because $f_i$ has coefficients in K. Hence $\varphi(a_{ij})$ equals some $a_{ij'}$. Thus $\varphi$
permutes the generators of L over K, and $\varphi(L) = L$. Therefore (b) holds.

If (b) holds, then any K field map of L into $K_{\mathrm{alg}}$ extends to a K automorphism
of $K_{\mathrm{alg}}$, by Theorem 9.23 of Basic Algebra. By (b), the extended mapping carries
L into itself. Thus (c) holds.

If (c) holds, let f in K[X] be irreducible over K, and suppose that $x_0$ is a
root of f in L. Let $x_1$ be another root of f in $K_{\mathrm{alg}}$. By the uniqueness of
simple extensions, we know that there exists a K isomorphism $\varphi : K(x_0) \to
K(x_1) \subseteq K_{\mathrm{alg}}$, and we can regard $\varphi$ as a K field map of $K(x_0)$ into $K_{\mathrm{alg}}$. The
map $\varphi$ extends to a K field automorphism of $K_{\mathrm{alg}}$, and we restrict the extension
to a map $\varphi : L \to K_{\mathrm{alg}}$. By (c), $\varphi(L) \subseteq L$. Since $K(x_0) \subseteq L$, we obtain
$K(x_1) = \varphi(K(x_0)) \subseteq \varphi(L) \subseteq L$. Thus $x_1$ is in L, and (d) holds.

If (d) holds, then for each element $x_i$ of L, let $f_i$ be the minimal polynomial
of $x_i$ over K. Certainly the field L is generated by K and the elements $x_i$. By
(d), each $f_i$ splits in L. Therefore L is generated over K by all the roots of the
polynomials $f_i$ and is normal. Thus (a) holds.

⁹This condition is always necessary. For it to be sufficient, one has to show that the only members
of $K_{\mathrm{alg}}$ fixed by all elements of $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ are the members of K.


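Proposition 7.25. Every member of $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ carries $K_{\mathrm{sep}}$ into itself, any
two members of $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ that agree on $K_{\mathrm{sep}}$ are equal on $K_{\mathrm{alg}}$, and any field
map of $K_{\mathrm{sep}}$ into $K_{\mathrm{alg}}$ extends to an automorphism of $K_{\mathrm{alg}}$. Consequently the
operation of restriction from $K_{\mathrm{alg}}$ to $K_{\mathrm{sep}}$ defines an isomorphism

$$\mathrm{Gal}(K_{\mathrm{alg}}/K) \cong \mathrm{Gal}(K_{\mathrm{sep}}/K).$$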


PROOF. The first statement has three conclusions to it. For the first conclusion,
if $\varphi$ is in $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ and if $x_0$ is in $K_{\mathrm{sep}}$, let f be the minimal polynomial of $x_0$
over K. By separability, f is a separable polynomial over K. Since $\varphi$ fixes f,
$\varphi$ carries $x_0$ to some root $x_1$ of f, and hence f is the minimal polynomial of $x_1$
over K. Since f is a separable polynomial over K, $x_1$ is separable over K and
lies in $K_{\mathrm{sep}}$.

For the second conclusion, let $\varphi$ be a member of $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ that is 1 on $K_{\mathrm{sep}}$.
If x is in $K_{\mathrm{alg}}$, then the pure inseparability of $K_{\mathrm{alg}}/K_{\mathrm{sep}}$ implies that $x^{p^e} = a$ for
some $a \in K_{\mathrm{sep}}$ and some integer $e \ge 0$. The element x has $(X - x)^{p^e} =
X^{p^e} - x^{p^e} = X^{p^e} - a$ and hence is the unique root of $X^{p^e} - a$. Since $\varphi(x)$ has
to be a root of this polynomial, $\varphi(x) = x$.

The third conclusion is a special case of the extendability to all of $K_{\mathrm{alg}}$ of any
field mapping of a subfield of $K_{\mathrm{alg}}$ into $K_{\mathrm{alg}}$.

The displayed isomorphism follows: the first conclusion shows that restriction
carries $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ into $\mathrm{Gal}(K_{\mathrm{sep}}/K)$, the second conclusion shows that
restriction is one-one, and the third conclusion shows that restriction is onto.


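Corollary 7.26. Let L be a field with $K \subseteq L \subseteq K_{\mathrm{sep}}$, form Gal(L/K), and
let $L^{\mathrm{Gal}(L/K)}$ be the fixed field

$$L^{\mathrm{Gal}(L/K)} = \{x \in L \mid \varphi(x) = x \text{ for all } \varphi \in \mathrm{Gal}(L/K)\}.$$

Then L is normal over K if and only if $L^{\mathrm{Gal}(L/K)} = K$.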


PROOF. Let L be normal over K, let x be in $L^{\mathrm{Gal}(L/K)}$, and let f be the minimal
polynomial of x over K. Since L is normal, f splits in L. Since $L \subseteq K_{\mathrm{sep}}$, the
roots of f in L all have multiplicity one. Arguing by contradiction, suppose
that x is not in K. Then deg f > 1, and f has another root $x_1$ besides x.
Hence we can find a K isomorphism $\varphi : K(x) \to K(x_1)$ with $\varphi(x) = x_1$. The
mapping $\varphi$ extends to a field automorphism of $K_{\mathrm{alg}}$, and Proposition 7.24 shows
that $\varphi(L) = L$, since L is normal. Thus $\varphi$ defines by restriction a member of
Gal(L/K). Since $\varphi(x) = x_1$, we have a contradiction to the assumption that x is
in $L^{\mathrm{Gal}(L/K)}$. Hence $L^{\mathrm{Gal}(L/K)} = K$.

Conversely let $L^{\mathrm{Gal}(L/K)} = K$. Let x be in L, and let f be its minimal
polynomial over K. Let $x_1 = x$ and $x_2, \dots, x_n$ be the distinct images of x in
L under members of Gal(L/K). These are all roots of f, and the roots of f
have multiplicity 1 because x lies in $K_{\mathrm{sep}}$. Each member of Gal(L/K) permutes
$x_1, \dots, x_n$ and hence acts via a permutation in the symmetric group $\mathfrak{S}_n$. Put
$g(X) = \prod_{i=1}^{n} (X - x_i)$. Expanding g gives

$$g(X) = X^n - \Bigl(\sum_i x_i\Bigr) X^{n-1} + \Bigl(\sum_{i<j} x_i x_j\Bigr) X^{n-2} - \cdots + (-1)^n \Bigl(\prod_i x_i\Bigr).$$

Each permutation of $\{x_1, \dots, x_n\}$ fixes the coefficients of g(X), which are members
of L, and hence the coefficients are in $L^{\mathrm{Gal}(L/K)} = K$. Therefore g(X) is
in K[X]. Since g(x) = 0, f(X) divides g(X). Over L, g(X) splits. By unique
factorization in L[X], f(X) must split, too. By Proposition 7.24, L is normal
over K.
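For orientation, here is a small worked instance of the criterion of Corollary 7.26 with K = Q. The two fields are standard illustrations chosen here as an assumption of this sketch; they are not examples taken from the text.

```latex
% Normal case: L = \mathbb{Q}(\sqrt{2}).
\mathrm{Gal}(L/\mathbb{Q}) = \{1, \sigma\},
\quad \sigma(a + b\sqrt{2}) = a - b\sqrt{2},
\qquad L^{\mathrm{Gal}(L/\mathbb{Q})} = \mathbb{Q}.
% For x = \sqrt{2} the distinct images are x_1 = \sqrt{2}, x_2 = -\sqrt{2}, and
g(X) = (X - \sqrt{2})(X + \sqrt{2}) = X^2 - 2 \in \mathbb{Q}[X].

% Non-normal case: L = \mathbb{Q}(\sqrt[3]{2}) \subset \mathbb{R}.
% X^3 - 2 has only one root in L, so every member of Gal(L/Q) fixes \sqrt[3]{2}:
\mathrm{Gal}(L/\mathbb{Q}) = \{1\},
\qquad L^{\mathrm{Gal}(L/\mathbb{Q})} = L \neq \mathbb{Q}.
```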


To obtain a version of the Fundamental Theorem of Galois Theory in the 
present context, it is necessary to introduce a topology on each Galois group. An 
example will illustrate. 


EXAMPLE. Let K be the finite field $\mathbb{F}_q$, where $q = p^r$ for a prime p. If $L_n$
is a finite extension of K of degree n, then Proposition 9.40 of Basic Algebra
shows that $\mathrm{Gal}(L_n/K)$ is cyclic of order n, a generator being the Frobenius
element $\mathrm{Fr}_q$ defined by $\mathrm{Fr}_q(x) = x^q$. The thing about the Frobenius element is
that it really makes sense on all $L_n$'s simultaneously. We know (from Proposition
7.15 for example) that every algebraic extension of K is separable, and hence
$K_{\mathrm{sep}} = K_{\mathrm{alg}}$. Here we can view $K_{\mathrm{sep}}$ as an aligned union of the fields $L_n$ for
$n \ge 1$, and $\mathrm{Fr}_q$ really makes sense as a member of $\mathrm{Gal}(K_{\mathrm{sep}}/K)$ under the same
definition: $\mathrm{Fr}_q(x) = x^q$. On each $L_n$, some nonzero power of $\mathrm{Fr}_q$ is the identity,
but this is no longer true on the infinite field $K_{\mathrm{sep}}$. Thus the mapping $1 \mapsto \mathrm{Fr}_q$
extends to a one-one homomorphism of $\mathbb{Z}$ into $\mathrm{Gal}(K_{\mathrm{sep}}/K)$. However, it is not
onto. Any element $\gamma$ of $\mathrm{Gal}(K_{\mathrm{sep}}/K)$ has the property that for each n, there is
a unique integer $k_n$ with $0 \le k_n < n$ such that $\gamma|_{L_n} = \mathrm{Fr}_q^{k_n}$, and the sequence
$\{k_n\}$ determines $\gamma$; nevertheless Problem 3 at the end of the chapter shows that
the sequence need not ultimately be constant, and therefore $\gamma$ need not be in
the image of $\mathbb{Z}$. The Galois group $\mathrm{Gal}(K_{\mathrm{sep}}/K)$ is instead a certain topological
completion of $\mathbb{Z}$ that is usually denoted by $\widehat{\mathbb{Z}}$. Taking the topology into account
will be essential to extending the Fundamental Theorem of Galois Theory, since
$\mathbb{Z}$ and $\widehat{\mathbb{Z}}$ are distinct subgroups of $\mathrm{Gal}(K_{\mathrm{sep}}/K)$ that have the same fixed field,
namely K itself.
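The finite-level behavior of the Frobenius element can be observed directly in small cases. The sketch below is an illustration only, assuming q = 2 and modeling each field of 2^n elements as polynomials over the two-element field modulo a fixed irreducible polynomial, encoded as bit masks. The names `gf2_mul`, `frobenius_power`, `frobenius_order`, and the table `FIELDS` are hypothetical helpers, not notation from the text.

```python
def gf2_mul(a, b, modulus, degree):
    """Multiply two elements of F_2[t]/(modulus), each encoded as a bit mask."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add the current shift of a
        b >>= 1
        a <<= 1                  # multiply a by t
        if (a >> degree) & 1:
            a ^= modulus         # reduce modulo the irreducible polynomial
    return result


def frobenius_power(x, k, modulus, degree):
    """Apply x -> x^2 (the Frobenius element for q = 2) k times."""
    for _ in range(k):
        x = gf2_mul(x, x, modulus, degree)
    return x


def frobenius_order(modulus, degree):
    """Smallest k >= 1 with Fr^k equal to the identity on the whole field."""
    size = 1 << degree
    for k in range(1, degree + 1):
        if all(frobenius_power(x, k, modulus, degree) == x for x in range(size)):
            return k
    return None


# Irreducible moduli over F_2: t^2+t+1, t^3+t+1, t^4+t+1 (degrees 2, 3, 4).
FIELDS = {2: 0b111, 3: 0b1011, 4: 0b10011}
orders = {n: frobenius_order(m, n) for n, m in FIELDS.items()}
```

In each case the order found equals the degree n, and the fifth power of the Frobenius element acts on the field of degree n as its power with exponent 5 mod n, which is the finite-level shadow of the sequence of exponents described in the example.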


If L is a normal extension of K with $L \subseteq K_{\mathrm{sep}}$, we shall introduce a topology on
Gal(L/K) to make “close” mean “equal on a large finite-dimensional subspace.”
With this intuition as a guide, we could define a basic neighborhood of an element
$\gamma_0$ of Gal(L/K) by taking finitely many elements $\alpha_1, \dots, \alpha_n$ in L and forming

$$\{\gamma \in \mathrm{Gal}(L/K) \mid \gamma\alpha_i = \gamma_0\alpha_i \text{ for } 1 \le i \le n\}.$$

It is more useful, however, to define the topology in another way, and then it will 
turn out that we indeed would have obtained a neighborhood basis by the above 
definition. In any event, the topology turns out to be compact Hausdorff and to 
make Gal(L/K) into a topological group. 

The method we use will be to define the topology as an “inverse limit.” In- 
verse limit is a general notion in category theory defined by a universal mapping 
property. As usual it consists of an object and a morphism; it need not exist in a 
general category, but when it does exist, it is unique up to canonical isomorphism. 
For the category of interest, the objects are the compact (Hausdorff) topological 
groups, and the morphisms are continuous group homomorphisms. If we wanted 
to emphasize the category-theory aspects of the construction, we would also need 
products of this category with itself, but we shall not belabor this point. 

Let I be a directed set, i.e., a nonempty partially ordered set under an ordering
$\le$ such that for any a and b in I, there is an element c in I with $a \le c$ and $b \le c$.
We allow ourselves to write $b \ge a$ in place of $a \le b$ whenever convenient. Two
examples of directed sets of particular interest both have I = {1, 2, 3, ...}; in
one case the ordering is given by $a \le b$ if a divides b, and in the other case the
ordering is given by the usual notion of inequality.

An inverse system $(I, \{G_i\}, \{f_{ij}\})$ in the category of compact topological
groups consists of a directed set I, a system of compact topological groups $G_i$,
one for each $i \in I$, and a system of continuous homomorphisms $f_{ij} : G_j \to G_i$,
defined whenever i and j are in I with $i \le j$, such that

• $f_{ii} = 1$ for all $i \in I$,
• $f_{ij} \circ f_{jk} = f_{ik}$ whenever $i \le j \le k$.


EXAMPLES.

(1) Let I = {1, 2, 3, ...} with $a \le b$ meaning that a divides b. Let $G_a$ be the
cyclic group $\mathbb{Z}/a\mathbb{Z}$ of order a. Define $f_{ab} : G_b \to G_a$ to be the homomorphism
such that $f_{ab}(1 + b\mathbb{Z}) = 1 + a\mathbb{Z}$.

(2) Let I = {1, 2, 3, ...} with the usual ordering. Fix a prime number p, and
define $G_a$ to be the cyclic group $\mathbb{Z}/p^a\mathbb{Z}$ of order $p^a$. Define $f_{ab} : G_b \to G_a$ to
be the homomorphism such that $f_{ab}(1 + p^b\mathbb{Z}) = 1 + p^a\mathbb{Z}$.
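The two defining conditions of an inverse system can be verified by brute force on a finite piece of the first example. The following is a sketch under the assumption that residues are represented by integers in range(a) and that only indices up to 12 are examined; `transition` and `is_inverse_system` are hypothetical helper names.

```python
def transition(a, b):
    """Return the map Z/bZ -> Z/aZ sending 1 + bZ to 1 + aZ (needs a | b)."""
    if b % a != 0:
        raise ValueError("the transition map is defined only when a divides b")

    def f_ab(residue):
        # Reduction mod a is well defined on residues mod b because a | b.
        return residue % a

    return f_ab


def is_inverse_system(indices):
    """Check f_ii = 1 and f_ij o f_jk = f_ik on the given finite index set."""
    indices = list(indices)
    for i in indices:
        if any(transition(i, i)(r) != r for r in range(i)):
            return False
    for i in indices:
        for j in indices:
            for k in indices:
                if j % i == 0 and k % j == 0:  # i <= j <= k under divisibility
                    f_ij, f_jk, f_ik = transition(i, j), transition(j, k), transition(i, k)
                    if any(f_ij(f_jk(r)) != f_ik(r) for r in range(k)):
                        return False
    return True


checked = is_inverse_system(range(1, 13))
```

The second example can be examined the same way by replacing each index a with the modulus p^a, since the usual ordering on exponents corresponds to divisibility of the moduli.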


An inverse limit $(G, \{f_i\}_{i \in I})$ of the inverse system $(I, \{G_i\}, \{f_{ij}\})$, often
written $G = \varprojlim G_i$ and sometimes also called the projective limit, consists
of a compact topological group G and continuous homomorphisms $f_i : G \to G_i$
such that
(i) $f_{ij} \circ f_j = f_i$ whenever $i \le j$,
(ii) whenever $(G', \{f_i'\}_{i \in I})$ is a pair consisting of a compact topological group
G' and continuous homomorphisms $f_i' : G' \to G_i$ such that $i \le j$ implies
$f_{ij} \circ f_j' = f_i'$, then there exists a unique continuous homomorphism
$F : G' \to G$ such that $f_i \circ F = f_i'$ for all i.


In the two examples the inverse limit group in the first case is $\widehat{\mathbb{Z}}$; in the second
case the inverse limit is isomorphic to the additive group $\mathbb{Z}_p$ of p-adic integers.
In the first case we omit a description of the homomorphisms $f_a : \widehat{\mathbb{Z}} \to \mathbb{Z}/a\mathbb{Z}$. In
the second case the homomorphisms $f_a$ are easy to describe: $f_a : \mathbb{Z}_p \to \mathbb{Z}/p^a\mathbb{Z}$
is given by the composition of the quotient homomorphism $\mathbb{Z}_p \to \mathbb{Z}_p/p^a\mathbb{Z}_p$ and
the isomorphism $\mathbb{Z}_p/p^a\mathbb{Z}_p \to \mathbb{Z}/p^a\mathbb{Z}$ asserted by Theorem 6.26e.
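A member of the inverse limit in the second example is a compatible family of residues, one modulo each power of p. The sketch below computes the first few coordinates of the family whose coordinate g_a satisfies 3·g_a ≡ 1 modulo 5^a, an illustrative choice made here and not an example from the text. The helper names `padic_coordinates` and `is_compatible` are hypothetical, and the three-argument `pow` with exponent −1 assumes Python 3.8 or later.

```python
def padic_coordinates(numerator, denominator, p, depth):
    """Coordinates (g_1, ..., g_depth), g_a in Z/p^a Z, of numerator/denominator.

    Requires that p not divide the denominator, so that the denominator is
    invertible modulo every power of p.
    """
    if denominator % p == 0:
        raise ValueError("denominator must be prime to p")
    coords = []
    for a in range(1, depth + 1):
        modulus = p ** a
        inverse = pow(denominator, -1, modulus)  # modular inverse, Python 3.8+
        coords.append((numerator * inverse) % modulus)
    return coords


def is_compatible(coords, p):
    """Check f_ab(g_b) = g_a for a <= b, i.e. g_b reduces to g_a modulo p^a."""
    return all(
        coords[b] % p ** (a + 1) == coords[a]
        for a in range(len(coords))
        for b in range(a, len(coords))
    )


one_third = padic_coordinates(1, 3, 5, 4)
```

No ordinary integer c satisfies 3c = 1, so the family computed here does not come from a member of Z, even though every finite truncation of it is represented by an integer.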


Proposition 7.27. In the category of compact topological groups, an inverse
system $(I, \{G_i\}, \{f_{ij}\})$ has at least one inverse limit, namely $(G, \{f_i\}_{i \in I})$ with

$$G = \Bigl\{ (g_i)_{i \in I} \in \prod_{i \in I} G_i \Bigm| f_{ij}(g_j) = g_i \text{ whenever } i \le j \Bigr\},$$

$$f_i = \text{restriction to } G \text{ of the } i\text{th projection } \prod_{j \in I} G_j \to G_i.$$

REMARKS. It is to be understood from the statement that G gets the relative
topology from $\prod_{i \in I} G_i$. We refer to this $(G, \{f_i\}_{i \in I})$ as the standard inverse
limit of $(I, \{G_i\}, \{f_{ij}\})$.


PROOF. If $(g_i)_{i \in I}$ and $(g_i')_{i \in I}$ are in G, then the fact that each $f_{ij}$ is a homomorphism
implies that $f_{ij}(g_j g_j') = g_i g_i'$ and that $f_{ij}(g_j^{-1}) = g_i^{-1}$. Therefore $(g_i g_i')_{i \in I}$
and $(g_i^{-1})_{i \in I}$ are in G, and G is a group. The subset of $G_i \times G_j$ with $f_{ij}(x_j) = x_i$
is topologically closed, and it follows that G is the intersection of closed sets and
hence is closed. Since $\prod_{i \in I} G_i$ is compact Hausdorff, G is compact Hausdorff.
The continuity of the multiplication and inversion is a consequence of those
properties for $\prod_{j \in I} G_j$. The $i$th projection of $\prod_{j \in I} G_j$ onto $G_i$ is a continuous
homomorphism, and hence so is the restriction of this projection to G.

Condition (i) in the definition of inverse limit is immediate, and we have to
prove (ii). Let $(G', \{f_i'\}_{i \in I})$ be given with each $f_i' : G' \to G_i$ having the
property that $i \le j$ implies $f_{ij} \circ f_j' = f_i'$. For each g' in G', the I-tuple
$(f_i'(g'))_{i \in I}$ is a member of $\prod_i G_i$, and the map $g' \mapsto (f_i'(g'))_{i \in I}$ is continuous
into the product topology because each entry is continuous. If $i \le j$, then
the tuple $(f_i'(g'))_{i \in I}$ has the property that $f_{ij}(f_j'(g')) = f_i'(g')$ because of the
given compatibility condition for the $f_i'$'s. Therefore the map F given by $g' \mapsto
(f_i'(g'))_{i \in I}$ has its image in the subset G of $\prod_i G_i$, and it is evidently a continuous
group homomorphism. The map F proves the existence assertion in (ii) because
$f_i \circ F(g') = f_i\bigl((f_j'(g'))_{j \in I}\bigr) = f_i'(g')$.

For uniqueness, suppose that $H : G' \to G$ is a continuous homomorphism
such that $f_i \circ H = f_i'$ for all i. For each $g' \in G'$, we have $f_i(H(g')) = f_i'(g')$.
Thus H(g') is the member $(g_i)_{i \in I}$ of $\prod_{i \in I} G_i$ for which $g_i = f_i'(g')$ for all i.
Hence H is uniquely determined.


Proposition 7.28. In the category of compact topological groups, any two
inverse limits for an inverse system $(I, \{G_i\}, \{f_{ij}\})$ are canonically isomorphic.


PROOF. This is a special case of the uniqueness in category theory of objects 
having a specific universal mapping property, as established in Basic Algebra. 


It is important in applications that the inverse limit of an inverse system of 
compact groups depend only on what happens far out in the directed set. We have 
not yet used that the indexing set is a directed set, rather than merely a partially 
ordered set, and we shall use this property now. 


Corollary 7.29. Let I be a directed set, let $j_0$ be in I, and let I' be the set of
members of I that are $\ge j_0$. If $(I, \{G_i\}, \{f_{ij}\})$ is an inverse system of compact
groups, then the two inverse systems $(I, \{G_i\}, \{f_{ij}\})$ and $(I', \{G_i\}, \{f_{ij}\})$ have
canonically isomorphic inverse limits, the isomorphism of the standard inverse
limit $G \subseteq \prod_{i \in I} G_i$ onto the standard inverse limit $G' \subseteq \prod_{i \ge j_0} G_i$ being given
by projection to the coordinates $\ge j_0$.


PROOF. Let $P : G \to G'$ be the projection, and let $f_i' : G' \to G_i$ for $i \ge j_0$
be the associated maps. Certainly $f_i' \circ P = f_i$ for $i \ge j_0$. We shall extend the
definition of $f_i'$ to apply to all $i \in I$. If $i \in I$ is given, we use the fact that I is
directed to choose i' with $i' \ge i$ and $i' \ge j_0$. Define $f_i' = f_{ii'} \circ f_{i'}'$. Let us see
that $f_i'$ is well defined. Let i'' have $i'' \ge i$ and $i'' \ge j_0$. Choose i''' with $i''' \ge i'$
and $i''' \ge i''$. The computation

$$f_{ii'} \circ f_{i'}' = f_{ii'} \circ f_{i'i'''} \circ f_{i'''}' = f_{ii'''} \circ f_{i'''}'$$

shows that i' and i''' yield the same definition of $f_i'$, and a similar argument
shows that i'' and i''' yield the same definition. Therefore i' and i'' yield the same
definition. Thus $f_i'$ is now defined for all i in I.

We shall show that $(G', \{f_i'\}_{i \in I})$ is an inverse limit of $(I, \{G_i\}, \{f_{ij}\})$, and then
the corollary follows from Proposition 7.28. Property (i) of inverse limits is built
into the definition of the homomorphisms $f_i'$. For property (ii) of G', suppose that
$(\widetilde{G}, \{\widetilde{f}_i\}_{i \in I})$ is a pair consisting of a compact topological group $\widetilde{G}$ and continuous
homomorphisms $\widetilde{f}_i : \widetilde{G} \to G_i$ such that $i \le j$ implies $f_{ij} \circ \widetilde{f}_j = \widetilde{f}_i$. By (ii) for
existence with G, find a continuous homomorphism $F : \widetilde{G} \to G$ with $f_i \circ F = \widetilde{f}_i$
for all i. Substituting from $f_i' \circ P = f_i$, we obtain $f_i' \circ (P \circ F) = \widetilde{f}_i$, and this
says that $P \circ F : \widetilde{G} \to G'$ is the map we seek for the existence in (ii) for G'. For
uniqueness in (ii), suppose that $F' : \widetilde{G} \to G'$ satisfies $f_i' \circ F' = \widetilde{f}_i$ for all i. Then
$f_i' \circ F' = f_i' \circ (P \circ F)$ for $i \ge j_0$. By (ii) for uniqueness with G', $F' = P \circ F$.
This says that the map from $\widetilde{G}$ to G' in (ii) is unique.


Let us now apply these considerations to topologize Galois groups of infinite 
separable normal algebraic extensions. The topologized Galois group will be the 
inverse limit of finite Galois groups, each with the discrete topology.¹⁰

We return to our field K, its algebraic closure $K_{\mathrm{alg}}$, and its separable algebraic
closure $K_{\mathrm{sep}}$ within $K_{\mathrm{alg}}$. Let L be a field with $K \subseteq L \subseteq K_{\mathrm{sep}}$, and assume that
L/K is a normal extension, not necessarily finite. We shall topologize Gal(L/K).
Let x be any element of L, and let F be the finite extension F = K(x) of K. If
f is the minimal polynomial of x over K, then f has a root in L and must split in
L because L/K is normal. Let $x_1, \dots, x_n$ be the roots of f, with $x_1 = x$. Then
$E = K(x_1, \dots, x_n)$ is a finite normal extension of K with $K \subseteq F \subseteq E \subseteq L$.
Since x is arbitrary in L, L is the union of all the finite normal extensions of K
lying within L.

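For each pair (E, E') of normal extensions of K with $K \subseteq E \subseteq E' \subseteq L$,
Proposition 7.24 gives us restriction homomorphisms $\varphi_{EE'} : \mathrm{Gal}(E'/K) \to
\mathrm{Gal}(E/K)$. We write $\varphi_E$ for the special case that E' = L, so that $\varphi_{EL} = \varphi_E$.
If $K \subseteq E \subseteq E' \subseteq E'' \subseteq L$, then $\varphi_{EE'} \circ \varphi_{E'E''} = \varphi_{EE''}$, and consequently the
system

$$\bigl(\{E \text{ finite normal extension of } K \text{ in } L\},\ \{\mathrm{Gal}(E/K)\},\ \{\varphi_{EE'}\}\bigr)$$

is an inverse system of (discrete finite) topological groups. Meanwhile, we can
form the group Gal(L/K) and the system $\{\varphi_E\}$ of homomorphisms with $\varphi_E =
\varphi_{EL}$.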


Proposition 7.30. With the above notation, the group Gal(L/K) may be
identified with the underlying abstract group of the inverse limit $\varprojlim_E \mathrm{Gal}(E/K)$,
taken over finite normal extensions E/K with $E \subseteq L$, in such a way that the
homomorphisms $\varphi_E$ become the homomorphisms of the inverse limit.


¹⁰The inverse limit of an inverse system of finite groups is called a profinite group. Profinite groups have special
properties by comparison with general compact groups, but it will not be necessary for us to undertake
a study of them.




PROOF. Let $G = \varprojlim_E \mathrm{Gal}(E/K)$, put $G_E = \mathrm{Gal}(E/K)$, and regard G as the
standard inverse limit given as in Proposition 7.27:

$$G = \Bigl\{ (\gamma_E)_E \in \prod_E G_E \Bigm| \varphi_{EE'}(\gamma_{E'}) = \gamma_E \text{ whenever } E \subseteq E' \Bigr\}.$$

For each E, we have a homomorphism $\varphi_E : \mathrm{Gal}(L/K) \to G_E$, and the product
of the values of these defines a homomorphism $\Phi : \mathrm{Gal}(L/K) \to \prod_E G_E$. The
relations $\varphi_{EE'} \circ \varphi_{E'E''} = \varphi_{EE''}$ show that the image of $\Phi$ is contained in the
subgroup G of $\prod_E G_E$. We shall show that $\Phi : \mathrm{Gal}(L/K) \to G$ is one-one
onto.

Let us see that $\Phi$ is one-one. If $\gamma \neq 1$ is in Gal(L/K), then there exists $x \in L$
with $\gamma(x) \neq x$. Let E be a finite normal extension of K within L containing x.
Then $\gamma|_E \neq 1$, and thus $\varphi_E(\gamma) \neq 1$. Hence $\Phi(\gamma) \neq 1$, and $\Phi$ is one-one.

Let us see that $\Phi$ is onto G. Let $(\gamma_E)_E \in G$ be given. For x in L, choose a
finite normal E with $x \in E$ and $E \subseteq L$, and define $\gamma(x) = \gamma_E(x)$. The relations
among the $\varphi_{EE'}$ show that this definition of $\gamma(x)$ is independent of the choice of
E, and $\gamma$ is therefore a field map of L into itself. Certainly $\gamma$ fixes K, and we
can construct an inverse to $\gamma$ from the mappings $\gamma_E^{-1}$. Thus $\gamma$ is in Gal(L/K).
Application of $\Phi$ gives $\Phi(\gamma) = (\varphi_E(\gamma))_E = (\gamma_E)_E$, and $\Phi$ is onto.


Using Proposition 7.30, we transfer the topology from $\varprojlim_E \mathrm{Gal}(E/K)$ to
Gal(L/K), and we can now regard Gal(L/K) as a compact topological group.
For any finite normal extension F of K with $F \subseteq L$, consider the group
Gal(L/F). The inverse-limit topology identifies Gal(L/K) with a subgroup
of $\prod_{E \supseteq K} \mathrm{Gal}(E/K)$, the product being taken over all finite normal extensions E
of K contained in L, and Corollary 7.29 allows us to identify Gal(L/K) with a
subgroup of

$$\prod_{E \supseteq F} \mathrm{Gal}(E/K),$$

the product being taken over all finite normal extensions E of F contained in L.
Under this identification Gal(L/F) is identified with the subgroup of elements $\gamma$
of the image of Gal(L/K) for which $\varphi_F(\gamma) = 1$. Since $\varphi_F$ is continuous, this is
a closed set. In turn, this set equals the image of Gal(L/F) in the subset

$$\prod_{E \supseteq F} \mathrm{Gal}(E/F).$$

The latter gives the standard inverse limit topology on Gal(L/F). Except for
some details, the conclusion is as follows.




Corollary 7.31. With the notation of Proposition 7.30, give Gal(L/K) the
inverse-limit topology. If F is a finite normal extension of K contained in L,
then Gal(L/F) is a closed subgroup of Gal(L/K), and the relative topology on
Gal(L/F) coincides with the inverse-limit topology of Gal(L/F). The subgroup
Gal(L/F) of Gal(L/K) is a normal subgroup of finite index in Gal(L/K). Being
a closed subgroup of finite index, it is an open subgroup.


PROOF. We still need to prove that Gal(L/F) has finite index in Gal(L/K).
Proposition 7.24 shows that the restriction to F of any member of Gal(L/K) is
an automorphism of F. Since F is a finite extension of K, there are only finitely
many possibilities for this automorphism. If two elements $\gamma$ and $\gamma'$ of Gal(L/K)
restrict to the same automorphism of F, then $\gamma^{-1}\gamma'$ is a member of Gal(L/K)
fixing F, i.e., a member of Gal(L/F). Thus $\gamma'$ lies in the coset $\gamma\,\mathrm{Gal}(L/F)$,
and we conclude that there are only finitely many cosets. Since every member of
Gal(L/K) restricts on F to an automorphism of F, the subgroup of members of
Gal(L/K) restricting to the identity on F is a normal subgroup. Thus Gal(L/F)
is normal in Gal(L/K).


Corollary 7.32. With the notation of Proposition 7.30, Gal(L/K) has a system
of open normal subgroups with intersection {1}. Hence the same thing is true of
any closed subgroup T of Gal(L/K). Moreover, if U is any open neighborhood
of 1 in T, then some open normal subgroup lies in U; consequently the open
normal subgroups of T form a neighborhood base about the identity.


PROOF. The open normal subgroups in the first conclusion are the subgroups
Gal(L/F) as in Corollary 7.31. Since every member of L lies in some finite
normal extension of K within L, a member of Gal(L/K) cannot lie in every
Gal(L/F) unless it is the identity on L.

Let U be an open neighborhood of 1 in the closed subgroup T of Gal(L/K).
The set-theoretic complement $U^c$ of U in T is a compact set, and the complements
of the open normal subgroups of T are open sets whose union covers $U^c$, by the
result of the previous paragraph. By compactness finitely many complements of
open normal subgroups of T together cover $U^c$. The intersection of these open
normal subgroups is then an open normal subgroup contained in U.


Theorem 7.33 (Fundamental Theorem of Galois Theory). Let K be a field,
and let $K_{\mathrm{alg}}$ be an algebraic closure, so that $K \subseteq K_{\mathrm{sep}} \subseteq K_{\mathrm{alg}}$. Let L be a
normal extension of K lying in $K_{\mathrm{sep}}$. Let $\mathcal{S}$ be the set of all closed subgroups of
Gal(L/K), and let $\mathcal{F}$ be the set of all intermediate fields between K and L. Then
$F \mapsto \mathrm{Gal}(L/F)$ is a one-one mapping of $\mathcal{F}$ onto $\mathcal{S}$ with inverse $S \mapsto L^S$, $L^S$
being the fixed field within L of the group S.


PROOF. First we show that Gal(L/F) is closed; Corollary 7.31 shows this only
when F is a finite normal extension of K. Let $\{F_\alpha\}$ be the set of all finite extensions
of K contained in F. Then $F = \bigcup_\alpha F_\alpha$, and thus $\mathrm{Gal}(L/F) = \bigcap_\alpha \mathrm{Gal}(L/F_\alpha)$.
Each $F_\alpha$ is contained in a finite normal extension $E_\alpha$ of K lying in L, and hence
$\mathrm{Gal}(L/F_\alpha) \supseteq \mathrm{Gal}(L/E_\alpha)$. Corollary 7.31 shows that $\mathrm{Gal}(L/E_\alpha)$ is an open
subgroup of Gal(L/K), and hence the larger subgroup $\mathrm{Gal}(L/F_\alpha)$ is open (as
a union of cosets, each of which is open). Open subgroups are closed. Thus
$\mathrm{Gal}(L/F_\alpha)$ is closed, and so is $\mathrm{Gal}(L/F) = \bigcap_\alpha \mathrm{Gal}(L/F_\alpha)$.

Next if F is in $\mathcal{F}$, then the inclusion $L \supseteq F$ and the fact that L is normal over
K together imply that L is normal over F. By Corollary 7.26, $F = L^{\mathrm{Gal}(L/F)}$.
Hence $F \mapsto \mathrm{Gal}(L/F)$ is one-one, and $S \mapsto L^S$ is a left inverse of it.

Finally we show that $S \mapsto L^S$ is a right inverse by showing that $\mathrm{Gal}(L/L^S) = S$
for any closed subgroup S of Gal(L/K). Define $T = \mathrm{Gal}(L/L^S)$. Certainly
$S \subseteq T$. The previous step shows that $F = L^{\mathrm{Gal}(L/F)}$ for all $F \in \mathcal{F}$. Taking
$F = L^S$ gives $L^S = L^{\mathrm{Gal}(L/L^S)} = L^T$. Let V be an arbitrary open normal
subgroup of T, and put $E = L^V$. The members of T/V give well-defined
automorphisms of E, and

$$E^{T/V} = (L^V)^{T/V} = L^T = L^S = (L^V)^{SV/V} = E^{SV/V}. \qquad (*)$$

The group T/V is a finite group of automorphisms of E fixing K, and Corollary
9.37 of Basic Algebra, when applied to the group T/V and the separable extension
$E/E^{T/V}$, shows that $T/V = \mathrm{Gal}(E/E^{T/V})$. Similarly it shows that $SV/V =
\mathrm{Gal}(E/E^{SV/V})$. By $(*)$, T/V = SV/V, i.e., T = SV. Corollary 7.32 shows
that the open normal subgroups of T form a neighborhood base about the identity
of T. From the equality T = SV for arbitrary V, let us see that

S is dense in T. $(**)$

Arguing by contradiction, let g be in T but not in the closure of S. Find V small
enough so that $gV^{-1} \cap S = \varnothing$. From T = SV, we can write g = sv with $s \in S$
and $v \in V$. Then $svV^{-1} \cap S = \varnothing$, and hence $vV^{-1} \cap S = \varnothing$. This last equality
is a contradiction, since the identity lies in $vV^{-1}$, and $(**)$ is proved. Since S
is closed, it follows from $(**)$ that S = T. But $T = \mathrm{Gal}(L/L^S)$ by definition.
Therefore $\mathrm{Gal}(L/L^S) = S$, and the proof of the theorem is complete.
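At a finite level, where every subgroup is automatically closed, the correspondence of the theorem reduces to the classical one and can be tabulated directly. The sketch below is an illustration only, assuming K is the field of 2 elements and L the field of 16 elements, modeled as polynomials over F_2 modulo t^4 + t + 1 with bit-mask encoding. The helper names `gf2_mul`, `frobenius_power`, and `fixed_field_size` are hypothetical.

```python
from math import gcd


def gf2_mul(a, b, modulus, degree):
    """Multiply two elements of F_2[t]/(modulus), each encoded as a bit mask."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> degree) & 1:
            a ^= modulus
    return result


def frobenius_power(x, k, modulus, degree):
    """Apply the squaring map x -> x^2 a total of k times."""
    for _ in range(k):
        x = gf2_mul(x, x, modulus, degree)
    return x


def fixed_field_size(d, modulus, degree):
    """Number of elements fixed by Fr^d, i.e. by the subgroup that Fr^d generates."""
    return sum(
        1 for x in range(1 << degree)
        if frobenius_power(x, d, modulus, degree) == x
    )


DEGREE, MODULUS = 4, 0b10011  # t^4 + t + 1, irreducible over F_2
# The subgroups of the cyclic group of order 4 are generated by Fr^1, Fr^2, Fr^4.
sizes = {d: fixed_field_size(d, MODULUS, DEGREE) for d in (1, 2, 4)}
```

The three subgroups have fixed fields with 2, 4, and 16 elements, so larger subgroups correspond to smaller intermediate fields, as the order-reversing bijection in the theorem predicts.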


Theorem 7.34. Let K be a perfect field, and L be an algebraically closed field 
containing K. Then the only members of L fixed by every element of Gal(L/K) 
are the members of K. 


PROOF. Proposition 7.15 shows that K sep = Kaig, and Corollary 7.26 implies 
that the only members of Kajg fixed by Gal(Kaig/K ) are the members of K. Thus 
we are done unless L contains elements not in Kajg. 

Let $x$ and $y$ be any two members of $L$ not in $K_{\mathrm{alg}}$, and let $\psi$ be in $\mathrm{Gal}(K_{\mathrm{alg}}/K)$.
The singleton sets $\{x\}$ and $\{y\}$ are transcendence sets over $K_{\mathrm{alg}}$, and Lemma 7.6
shows that they can be extended to transcendence bases of $L$ over $K_{\mathrm{alg}}$. Call
these transcendence bases $E$ and $F$, respectively. Theorem 7.9 shows that $E$ and
$F$ have the same cardinality. Therefore there exists a one-one function $\varphi$ of $E$
onto $F$ such that $\varphi(x) = y$. This function $\varphi$ extends uniquely to a field map
$\Phi$ of $K_{\mathrm{alg}}(E)$ onto $K_{\mathrm{alg}}(F)$ that restricts to $\psi$ on $K_{\mathrm{alg}}$. Theorem 7.7 shows that
$L$ is an algebraic extension of $K_{\mathrm{alg}}(E)$ and of $K_{\mathrm{alg}}(F)$; hence $L$ is an algebraic
closure of $K_{\mathrm{alg}}(E)$ and of $K_{\mathrm{alg}}(F)$. The composition of $\Phi$ followed by inclusion
is a field map of $K_{\mathrm{alg}}(E)$ into $L$, and Theorem 9.23 of Basic Algebra shows that
it can be extended to a field map $\widetilde{\Phi}$ of $L$ into $L$. Since $\widetilde{\Phi}(L)$ is an algebraic
closure of $K_{\mathrm{alg}}(F)$, $\widetilde{\Phi}(L) = L$. Thus there exists a member $\widetilde{\Phi}$ of $\mathrm{Gal}(L/K)$
with $\widetilde{\Phi}(x) = y$ such that $\widetilde{\Phi}|_{K_{\mathrm{alg}}} = \psi$.

Taking $\psi$ to be the identity shows that no element of $L$ transcendental over $K$
is fixed by $\mathrm{Gal}(L/K)$. If an element $z$ of $K_{\mathrm{alg}}$ is given that is not in $K$, then the
first paragraph of the proof produces a member $\psi$ of $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ that moves $z$.
Applying the result of the second paragraph to this $\psi$ with $x$ arbitrary and with
$y = x$ shows that $\psi$ extends to a member of $\mathrm{Gal}(L/K)$ that moves $z$.


7. Problems 


1. Let L/K be a field extension in characteristic p. Prove that the set of elements
of L that are purely inseparable over K is a subfield of L. 


2. In characteristic $p$, let $K(\alpha)$ be an algebraic extension of a field $K$, and form the
inclusions $K \subseteq K(\alpha^{p^e}) \subseteq K(\alpha)$, where $\alpha^{p^e}$ is the smallest power of $\alpha$ that is
separable over $K$. Prove that the subfield of separable elements in the extension
$K(\alpha)/K$ consists exactly of $K(\alpha^{p^e})$, i.e., that no separable elements of $K(\alpha)$
over $K$ lie outside $K(\alpha^{p^e})$.


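3. Partially order the positive integers by saying that $a \le b$ if $a$ divides $b$. Let
$(\widehat{\mathbb{Z}}, \{f_a\}_{a \ge 1})$ be the inverse limit of the cyclic groups $\mathbb{Z}/a\mathbb{Z}$, with the
homomorphism $f_{ab}$ from $\mathbb{Z}/b\mathbb{Z}$ to $\mathbb{Z}/a\mathbb{Z}$ being given by $f_{ab}(1 + b\mathbb{Z}) = 1 + a\mathbb{Z}$
when $a$ divides $b$. Each member $c$ of $\mathbb{Z}$ defines a member $z_c$ of $\widehat{\mathbb{Z}}$ such that
$f_a(z_c) = c + a\mathbb{Z}$ for all $a$. Exhibit some other explicit member of $\widehat{\mathbb{Z}}$.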


4. Prove that the only members of C fixed by all members of Gal(C/Q) are the 
members of Q. What members of R are fixed by Gal(R/Q)? 


5. By making use of the field $K = \mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5}, \sqrt{7}, \dots)$, show that there exist
subgroups of $\mathrm{Gal}(\mathbb{Q}_{\mathrm{alg}}/\mathbb{Q})$ of index 2 that are not open.


Problems 6-14 concern primary ideals and make use of the notion of the radical $\sqrt{I}$
of an ideal $I$ as defined in Section 1. Throughout, $R$ will denote a commutative ring
with identity. A proper ideal $I$ of $R$ is primary if whenever $a$ and $b$ are in $R$, $ab$ is
in $I$, and $a$ is not in $I$, then $b^m$ is in $I$ for some integer $m > 0$. It is immediate that
every prime ideal is primary.

6. Prove that an ideal $I$ of $R$ is primary if and only if every zero divisor in $R/I$ is
nilpotent (in the sense that some power of it is 0), if and only if 0 is primary in
$R/I$.

7. (a) Prove that if $I$ is a primary ideal, then $\sqrt{I}$ is a prime ideal. (Educational
note: In this case the prime ideal $\sqrt{I}$ is called the associated prime ideal
to $I$.)
(b) Prove that if $I$ is any ideal and if $I \subseteq J$ for a prime ideal $J$, then $\sqrt{I} \subseteq J$.


8. (a) Show that the primary ideals in $\mathbb{Z}$ are 0 and $(p^n)$ for $p$ prime and $n > 0$.

(b) Let $R = \mathbb{C}[x, y]$ and $I = (x, y^2)$. Use Problem 6 to show that $I$ is primary.
Show that $P = \sqrt{I}$ is given by $P = (x, y)$. Deduce that $P^2 \subsetneq I \subsetneq P$ and
that a primary ideal is not necessarily a power of a prime ideal.

(c) Let $K$ be a field, let $R = K[X, Y, Z]/(XY - Z^2)$, and let $x, y, z$ be the
images of $X, Y, Z$ in $R$. Show that $P = (x, z)$ is prime by showing that
$R/P$ is an integral domain. Show that $P^2$ is not primary by starting from
the fact that $xy = z^2$ lies in $P^2$.


9. Prove that if $I$ is an ideal such that $\sqrt{I}$ is maximal, then $I$ is primary. Deduce
that the powers of a maximal ideal are primary.


10. An ideal is reducible if it is the finite intersection of ideals strictly containing it; 
otherwise it is irreducible. 

(a) Show that every prime ideal is irreducible. 

(b) Let $R = \mathbb{C}[x, y]$, and let $I$ be the maximal ideal $(x, y)$. Show that $I^2$ is
primary and that the equality $I^2 = (Rx + I^2) \cap (Ry + I^2)$ exhibits $I^2$ as
reducible.

11. Prove that if $R$ is Noetherian, then every ideal is a finite intersection of proper
irreducible ideals. (The ideal $R$ is understood to be an empty intersection.)

12. Suppose that $R$ is Noetherian and that $Q$ is a proper irreducible ideal in $R$. Prove
that 0 is primary in $R/Q$, and deduce that $Q$ is primary in $R$.

13. Prove that if $Q_1, \dots, Q_n$ are primary ideals in $R$ that all have $\sqrt{Q_i} = P$, then
$Q = \bigcap_{i=1}^{n} Q_i$ is primary with $\sqrt{Q} = P$.

14. (Lasker-Noether Decomposition Theorem) The expression $I = \bigcap_{i=1}^{n} Q_i$ of
an ideal $I$ as an intersection of primary ideals $Q_i$ is said to be irredundant if
(i) no $Q_i$ contains the intersection of the other ones, and
(ii) the $Q_i$ have distinct associated prime ideals.


Prove that if R is Noetherian, then every ideal is the irredundant intersection of 
finitely many primary ideals. 


CHAPTER VIII 


Background for Algebraic Geometry 


Abstract. This chapter introduces aspects of the algebraic theory of systems of polynomial equations 
in several variables. 

Section 1 gives a brief history of the subject, treating it as one of two early sources of questions
to be addressed in algebraic geometry. 

Section 2 introduces the resultant as a tool for eliminating one of the variables in a system of 
two such equations. A first form of Bezout’s Theorem is an application, saying that if f(X, Y) and 
g(X, Y) are polynomials of respective degrees m and n whose locus of common zeros has more 
than mn points, then f and g have a nontrivial common factor. This version of the theorem may be 
regarded as pertaining to a pair of affine plane curves. 

Section 3 passes to projective plane curves, which are nonconstant homogeneous polynomials in 
three variables, two such being regarded as the same if they are multiples of one another. Versions of 
the resultant and Bezout’s Theorem are valid in this context, and two projective plane curves defined 
over an algebraically closed field always have a common zero. 

Sections 4-5 introduce intersection multiplicity for projective plane curves. Section 4 treats a
line and a curve, and Section 5 treats the general case of two curves. The theory in Section 4 is 
completely elementary, and a version of Bezout’s Theorem is proved that says that a line and a curve 
of degree d have exactly d common zeros, provided the underlying field is algebraically closed, 
the zeros are counted as often as their intersection multiplicities, and the line does not divide the 
curve. Section 5 makes more serious use of algebraic background, particularly localizations and the 
Nullstellensatz. It gives an indication that ostensibly simple phenomena in the subject can require 
sophisticated tools to analyze. 

Section 6 proves a version of Bezout’s Theorem appropriate for the context of Section 5: if F 
and G are two projective plane curves of respective degrees m and n over an algebraically closed 
field, then either they have a nontrivial common factor or they have exactly mn common zeros when 
the intersection multiplicities of the zeros are taken into account. 

Sections 7-10 concern Gröbner bases, which are finite generating sets of a special kind for ideals
in a polynomial algebra over a field. Section 7 sets the stage, introducing monomial orders and
defining Gröbner bases. Section 8 establishes a several-variable analog of the division algorithm for
polynomials in one variable and derives from it a usable criterion for a finite set of generators to be a
Gröbner basis. From this it is easy to give a constructive proof of the existence of Gröbner bases and
to obtain as consequences solutions of the ideal-membership problem and the proper-ideal problem.
Section 9 obtains a uniqueness theorem under the condition that the Gröbner basis be reduced.
Adjusting a Gröbner basis to make it reduced is an easy matter. A consequence of the uniqueness
result is a solution of the ideal-equality problem. Section 10 gives two theorems concerning solutions
of systems of polynomial equations. The Elimination Theorem identifies in terms of Gröbner bases
those members of the ideal that depend only on a certain subset of the variables. The Extension
Theorem, proved under the additional assumption that the underlying field is algebraically closed,
gives conditions under which a solution to the subsystem of equations that depend on all but one
variable can be extended to a solution of the whole system. The latter theorem makes use of the
theory of resultants.


1. Historical Origins and Overview 


Modern algebraic geometry grew out of early attempts to solve simultaneous 
polynomial equations in several variables and out of the theory of Riemann 
surfaces. We shall discuss the first of these sources in the present chapter and the 
second of the sources in Chapter IX. 

Serious consideration of simultaneous polynomial equations of degree $\ge 2$
dates to a 1750 book¹ by Gabriel Cramer (1704-1752), who may be better
known for Cramer’s rule in connection with determinants. Cramer was interested
in various aspects of the zero loci of polynomials in two variables with real
coefficients. Thinking of the zero locus, we refer to a nonconstant polynomial in
two variables as a plane curve.

One of the problems of interest to Cramer was to find the number of points in
the plane that would uniquely determine a plane curve of degree $n$ up to a constant
multiple. Cramer gave the answer $\tfrac{1}{2}n(n+3)$ to this problem. For example, when
$n = 2$, if we normalize matters by taking the coefficient of $x^2$ to be 1, then the
possible quadratic polynomials

$$f(x, y) = x^2 + bxy + cy^2 + dx + ey + f$$

involve five unknown coefficients. Each condition $f(x_i, y_i) = 0$ gives a linear
condition on the coefficients, and Cramer was able to write down explicitly a
plane curve through the given points in question by introducing determinants and
applying his rule to solve the problem.
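
The linear-algebra step in the case $n = 2$ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it solves the five linear conditions by Gauss-Jordan elimination over the rationals rather than by Cramer's determinants, and the function names `solve_linear` and `conic_through` as well as the sample points are hypothetical choices made for this example.

```python
from fractions import Fraction

def solve_linear(A, rhs):
    """Solve the square system A x = rhs exactly by Gauss-Jordan elimination."""
    n = len(A)
    # Augmented matrix with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            # Special position of the points: existence or uniqueness fails.
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def conic_through(points):
    """Return (b, c, d, e, f) with x^2 + b xy + c y^2 + d x + e y + f = 0 at all five points."""
    A = [[x * y, y * y, x, y, 1] for x, y in points]
    rhs = [-x * x for x, y in points]
    return solve_linear(A, rhs)

# Five rational points on the unit circle (an illustrative choice).
points = [(1, 0), (-1, 0), (0, 1), (0, -1), (Fraction(3, 5), Fraction(4, 5))]
coeffs = conic_through(points)  # coefficients b, c, d, e, f
```

For these sample points the computed coefficients describe the circle $x^2 + y^2 - 1$, and a `ValueError` signals one of the special configurations of points for which the linear system is singular.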

Already with this much description the reader will see a certain subtlety, namely that
there will be special choices of the five points for which existence or uniqueness 
will fail. We could also ask about the effect of multiplicities: what does it mean 
geometrically to take two or more of the points to be equal, and how does such 
an occurrence affect the number of points that can be specified? 

Cramer noticed a subtlety that is less easy to resolve, even in hindsight. If we
are given any two plane curves of degree 3, then Cardan’s formula says that we
can solve one equation for $y$ in terms of $x$, obtaining three expressions in $x$; then
we can substitute for $y$ in the other equation each of the three expressions in $x$ and
obtain a cubic equation in $x$ each time. In other words, we should expect up to 9
points of intersection for two cubics, and 9 should sometimes occur. (The various
forms of Bezout’s Theorem, which came a little later, confirm this argument.) The
number of points that determine a cubic completely is $\tfrac{1}{2}n(n+3)$ for $n = 3$, i.e.,
is 9. Thus we have 9 points determining a unique cubic, and yet the second cubic
goes through these 9 points as well. What is happening? This question has come
to be known as Cramer’s paradox.

¹G. Cramer, Introduction à l’Analyse des Lignes courbes algébriques, Chez les Frères Cramer
& Cl. Philibert, Geneva, 1750.

Explaining this kind of mystery became an early impetus for the development 
of algebraic geometry. 

The question of the number of points of intersection had been the subject of 
conjecture for some time earlier, and it was expected that two plane curves of 
respective total degrees m and n in some sense had mn points of intersection. 
Etienne Bezout (1730-1783) took up this question and dealt with parts of it 
rigorously. The quadratic case can be solved by finding one variable in terms of 
the other and by substituting, but let us handle it by the method that Bezout used. 
If we view each polynomial as quadratic in $y$ and having coefficients that depend
on $x$, then we have a system

$$a_0 + a_1 y + a_2 y^2 = 0,$$
$$b_0 + b_1 y + b_2 y^2 = 0.$$

Instead of regarding this as a system of two equations for $y$, we regard it as
a system of two homogeneous linear equations for variables $x_0, x_1, x_2$, where
$x_0 = 1$, $x_1 = y$, $x_2 = y^2$. We can get two further equations by multiplying each
equation by $y$:

$$a_0 y + a_1 y^2 + a_2 y^3 = 0,$$
$$b_0 y + b_1 y^2 + b_2 y^3 = 0,$$

and then we have four homogeneous linear equations for $x_0 = 1$, $x_1 = y$, $x_2 = y^2$,
$x_3 = y^3$. Since the system has the nonzero solution $(1, y, y^2, y^3)$, the
determinant of the coefficient matrix must be 0. Remembering that the coefficients
depend on $x$, we see that we have eliminated the variable $y$ and obtained a
polynomial equation for $x$ without using any solution formula for polynomials in one
variable. The device that Bezout introduced for this purpose, the determinant of
the coefficient matrix, is called the resultant of the system and is a fundamental
tool in handling simultaneous polynomial equations. With it Bezout went on in
1779 to give a rigorous proof that when two polynomials in $(x, y)$ are set equal
to 0 simultaneously, one of degree $m$ and the other of degree $n$, then there cannot
be more than $mn$ solutions unless the two polynomials have a common factor.
This is a first form of Bezout’s Theorem and is proved in Section 2.
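
The elimination just described can be tried on a small example. The following Python sketch is an illustration only: the two conics $y^2 + x^2 - 5$ and $y^2 + x - 3$ are sample curves chosen here, polynomials in $x$ are stored as ascending coefficient lists, and the helper names (`poly_add`, `poly_mul`, `poly_det`, `poly_eval`) are hypothetical. The determinant is expanded over all permutations, which is adequate only for a matrix this small.

```python
from itertools import permutations

def poly_add(p, q):
    """Add two polynomials in x given as ascending coefficient lists."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_mul(p, q):
    """Multiply two polynomials in x given as ascending coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def perm_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def poly_det(matrix):
    """Determinant of a square matrix of polynomials, by expansion over permutations."""
    n = len(matrix)
    total = [0]
    for perm in permutations(range(n)):
        term = [perm_sign(perm)]
        for row, col in enumerate(perm):
            term = poly_mul(term, matrix[row][col])
        total = poly_add(total, term)
    while len(total) > 1 and total[-1] == 0:
        total.pop()  # drop trailing zero coefficients
    return total

def poly_eval(p, x):
    return sum(c * x**k for k, c in enumerate(p))

# Sample conics: y^2 + (x^2 - 5) = 0 and y^2 + (x - 3) = 0.
a0, a1, a2 = [-5, 0, 1], [0], [1]
b0, b1, b2 = [-3, 1], [0], [1]
zero = [0]
# Coefficient matrix of the four homogeneous linear equations in (1, y, y^2, y^3).
matrix = [
    [a0, a1, a2, zero],
    [b0, b1, b2, zero],
    [zero, a0, a1, a2],
    [zero, b0, b1, b2],
]
elim = poly_det(matrix)  # a polynomial in x alone
```

For this pair of conics the common zeros are $(2, \pm 1)$ and $(-1, \pm 2)$, and the eliminated polynomial `elim` vanishes at $x = 2$ and $x = -1$, as the argument in the text predicts.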

In order to have a chance of obtaining a full complement of $mn$ solutions, we
make three adjustments: allow complex solutions instead of just real solutions
(even in the case $(m, n) = (2, 1)$), consider “projective plane curves” instead of
ordinary plane curves to allow for solutions at infinity (even in the case $(m, n) =
(1, 1)$), and introduce a suitable notion of intersection number of two plane curves
at a point in order to take multiplicities into account (even in the case $(m, n) =
(2, 1)$). We shall allow complex solutions already in Section 2, and we shall make
an adjustment for projective plane curves in Section 3. The issue of intersection
multiplicity is more complicated. The beginnings of a classical approach to it
are indicated in Section 4, and a somewhat more modern approach appears in
Section 5. With the full theory of intersection multiplicities of projective plane
curves in place, we obtain a general form of Bezout’s Theorem² in Section 6.

The theory of the resultant can be extended in various ways, but we shall
largely not pursue this matter. Studies of zero loci of systems of equations took
a more geometric turn in the first part of the nineteenth century through the work
of Julius Plücker (1801-1858) and others, but these matters will be left for an
implicit discussion in Chapter X. Instead, we skip to a development that began
with the doctoral thesis of Bruno Buchberger in 1965. Buchberger was interested
in being able to decide when a polynomial is a member of an ideal that is specified
by a finite list of generators. For this purpose he learned that each ideal has a
special finite set of generators that is unique once certain declarations are made.
He devised an algorithm for determining such a set of generators,³ and he gave
the name “Gröbner basis” to the set, in honor of his thesis advisor.⁴ The special
unique such basis is called a “reduced Gröbner basis.”

An unfortunate feature of the algorithm (and even of later improved algorithms) 
is that Gröbner bases are extraordinarily complicated to calculate. The timing
of Buchberger’s discovery was therefore especially fortuitous, coming when 
computers were becoming more common, more economical, and more powerful. 

Buchberger was able to give a test for membership in an ideal in terms of
a multivariable division algorithm involving any Gröbner basis. Other general
problems involving ideals were solvable as well. Because of the uniqueness of the
reduced Gröbner basis, two ideals are identical if and only if their reduced Gröbner
bases are equal. When some of the theory of resultants was incorporated into
the theory of Gröbner bases, these bases could also be used to address various
questions of identifying zero loci. Other problems involving ideals could be
addressed by similar methods. The theory has flowered tremendously since its
initial discovery and by the present day has found many imaginative applications
to applied problems. Sections 7-10 give an introductory account of this important
theory.


²A correct proof of the general form of the theorem seems to have been published for the first
time by Georges-Henri Halphen (1844-1889) in 1873.

³Devising the algorithm was Buchberger’s real contribution, since the abstract existence of the
special set of generators is an easy consequence of the Hilbert Basis Theorem and had already been
used in papers of H. Hironaka in 1964.

⁴Wolfgang Gröbner (1899-1980). The name is often spelled out as “Groebner,” particularly
when it is used in connection with computer algorithms.


2. Resultant and Bezout’s Theorem 


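Let $A$ be a unique factorization domain. The case that $A = K[X_1, \dots, X_r]$ for
a field $K$ will be the main case of interest for us. If $f$ and $g$ are polynomials in
$A[X]$ of the form

$$f(X) = f_0 + f_1 X + \cdots + f_m X^m,$$
$$g(X) = g_0 + g_1 X + \cdots + g_n X^n,$$

with $m$ and $n$ both positive, then we let $\mathcal{R}(f, g)$ be the $(m+n)$-by-$(m+n)$ matrix

$$\begin{pmatrix}
f_0 & f_1 & \cdots & f_m & 0 & \cdots & 0 \\
0 & f_0 & f_1 & \cdots & f_m & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & \vdots \\
0 & \cdots & 0 & f_0 & f_1 & \cdots & f_m \\
g_0 & g_1 & \cdots & g_n & 0 & \cdots & 0 \\
0 & g_0 & g_1 & \cdots & g_n & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & \vdots \\
0 & \cdots & 0 & g_0 & g_1 & \cdots & g_n
\end{pmatrix}$$

in which there are $n$ rows above the $g_0$ in the first column and there are $m$ remaining
rows. The resultant of $f$ and $g$ is the determinant

$$R(f, g) = \det \mathcal{R}(f, g).$$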


Theorem 8.1. If $A$ is a unique factorization domain and if $f$ and $g$ are nonzero
members of $A[X]$ of the form $f(X) = \sum_{i=0}^{m} f_i X^i$ and $g(X) = \sum_{j=0}^{n} g_j X^j$ with
$m > 0$ and $n > 0$ and with at least one of $f_m$ and $g_n$ nonzero, then the following
are equivalent:

(a) $f$ and $g$ have a common factor of degree $> 0$ in $X$,
(b) $af + bg = 0$ for some nonzero $a$ and $b$ in $A[X]$ with $\deg a < n$ and
$\deg b < m$,
(c) $R(f, g) = 0$.

Regard $R(f, g)$ as a constant polynomial in $X$. When $R(f, g) \ne 0$, there
exist unique $a$ and $b$ in $A[X]$ such that $a(X) f(X) + b(X) g(X) = R(f, g)$ with
$\deg a < n$ and $\deg b < m$. Both the polynomials $a$ and $b$ are nonzero if both
$f(X)$ and $g(X)$ are nonconstant.


REMARKS. The theorem says that $af + bg = R(f, g)$ holds in every case
for which at least one of the coefficients $f_m$ and $g_n$ is nonzero. Sometimes the
theorem appears in texts with the assumption that both coefficients are nonzero;
in this connection, see Problem 5 at the end of the chapter. When $R(f, g) = 0$,
the theorem does not point to a useful way to identify a common factor; the
division algorithm can be used for this purpose in some circumstances, but the
use of Gröbner bases as in Section 7 will be more helpful.
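
As a concrete illustration of the equivalence of (a) and (c), the following Python sketch builds the matrix used to define the resultant, with the rows of coefficients in ascending order as in the definition above, and evaluates its determinant for integer coefficients. The function names `sylvester_matrix` and `det` and the sample polynomials are assumptions made for this example, and the cofactor expansion is meant only for small matrices.

```python
def sylvester_matrix(f, g):
    """Matrix defining the resultant; f = [f_0, ..., f_m], g = [g_0, ..., g_n], m, n > 0."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = []
    for i in range(n):  # n shifted copies of the coefficients of f
        rows.append([0] * i + list(f) + [0] * (size - m - 1 - i))
    for i in range(m):  # m shifted copies of the coefficients of g
        rows.append([0] * i + list(g) + [0] * (size - n - 1 - i))
    return rows

def det(M):
    """Exact determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * entry * det(minor)
    return total

f = [2, -3, 1]          # (X - 1)(X - 2)
g_shared = [-3, 2, 1]   # (X - 1)(X + 3), shares the factor X - 1 with f
g_coprime = [-3, 1]     # X - 3, no common factor with f

res_shared = det(sylvester_matrix(f, g_shared))    # vanishes: common factor
res_coprime = det(sylvester_matrix(f, g_coprime))  # nonzero: no common factor
```

In the second case the value agrees with $f(3) = 2$, which is what one expects up to sign for the resultant of $f$ with the linear polynomial $X - 3$.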




PROOF. Let us prove the equivalence of (a) and (b). Suppose that (a) holds.
If $u$ is a nonconstant polynomial in $X$ that divides both $f$ and $g$, let us write
$f = bu$ and $g = -au$. Then $af + bg = 0$. Also, $\deg a + \deg u = \deg g$;
since $\deg u > 0$, $\deg a < \deg g \le n$. Similarly $\deg b < m$. Thus (b) holds.
Conversely suppose that (b) holds, so that $af = -bg$ with $a$ and $b$ nonzero and
with $\deg a < n$ and $\deg b < m$. Suppose that $f_m \ne 0$. The equality $af = -bg$
shows that $f$ divides $bg$. Since $\deg b < m = \deg f$, $f$ cannot divide $b$. But
$A[X]$ is a unique factorization domain, and thus there is some prime factor $p$ of
$f$ of positive degree such that $p^k$ for some $k$ divides $f$ but not $b$. Then $p$ divides
both $f$ and $g$, and (a) holds. A similar argument works if $g_n \ne 0$.

Now we prove the equivalence of (b) and (c). Let $F$ be the field of fractions
of $A$. We set up a one-one correspondence between polynomials $a(X)$ in $A[X]$
of degree at most $n - 1$ and $n$-dimensional row vectors $(\alpha_0\ \alpha_1\ \cdots\ \alpha_{n-1})$
with entries in $A$ by the formula

$$a(X) = \alpha_0 + \alpha_1 X + \cdots + \alpha_{n-1} X^{n-1},$$

and similarly we set up one-one correspondences for degrees at most $m - 1$ and
at most $m + n - 1$ by the formulas

$$b(X) = \beta_0 + \beta_1 X + \cdots + \beta_{m-1} X^{m-1},$$
$$c(X) = \gamma_0 + \gamma_1 X + \cdots + \gamma_{m+n-1} X^{m+n-1}.$$

Examining the form of the matrix $\mathcal{R}(f, g)$, we see that the matrix equality

$$(\alpha_0\ \cdots\ \alpha_{n-1}\ \ \beta_0\ \cdots\ \beta_{m-1})\, \mathcal{R}(f, g) = (\gamma_0\ \gamma_1\ \cdots\ \gamma_{m+n-1}) \qquad (*)$$

holds if and only if the polynomial equality

$$a(X) f(X) + b(X) g(X) = c(X) \qquad (**)$$

holds. If (b) holds, then $af = -bg$, and (**) shows that $c = 0$. That is,
$(\gamma_0\ \gamma_1\ \cdots\ \gamma_{m+n-1})$ is the 0 row vector. Interpreting (*) as a matrix equality
over $F$ and assuming that $a$ and $b$ are not both 0, we see that the transpose
of $\mathcal{R}(f, g)$ has a nontrivial null space. Therefore $R(f, g) = \det \mathcal{R}(f, g) =
0$. This proves (c). Conversely if (c) holds, then we can find row vectors
$(\alpha_0\ \alpha_1\ \cdots\ \alpha_{n-1})$ and $(\beta_0\ \beta_1\ \cdots\ \beta_{m-1})$ not both 0, having entries
in $F$, such that the left side of (*) equals the 0 row vector. Clearing fractions, we
may assume that $(\alpha_0\ \alpha_1\ \cdots\ \alpha_{n-1})$ and $(\beta_0\ \beta_1\ \cdots\ \beta_{m-1})$ have entries
in $A$. Referring to (*), we obtain $af + bg = 0$ with $\deg a$ at most $n - 1$ and $\deg b$
at most $m - 1$. We know that at least one of $a$ and $b$ is nonzero, and we have to
see that both are nonzero. The situation is symmetric in $a$ and $b$. If $a$ were to
equal 0, then we would have $bg = 0$ and we could conclude that $b = 0$ because
$g \ne 0$. So we would obtain the contradiction $a = b = 0$. This proves (b).

For the last statement of the theorem, suppose that $R(f, g) \ne 0$. Then Cramer’s
rule applied over the field of fractions $F$ of $A$ shows that the matrix inverse of
$\mathcal{R}(f, g)$ is of the form

$$\mathcal{R}(f, g)^{-1} = R(f, g)^{-1}\, \mathcal{S}(f, g),$$

where $\mathcal{S}(f, g)$ is a matrix with entries in $A$. Consequently the row vector

$$(R(f, g)\ \ 0\ \cdots\ 0)\, \mathcal{R}(f, g)^{-1}$$

has entries in $A$, and we can define members $\alpha_0, \dots, \alpha_{n-1}, \beta_0, \dots, \beta_{m-1}$ of $A$ by

$$(\alpha_0\ \cdots\ \alpha_{n-1}\ \ \beta_0\ \cdots\ \beta_{m-1}) = (R(f, g)\ \ 0\ \cdots\ 0)\, \mathcal{R}(f, g)^{-1}.$$

Then (*) holds with $(\gamma_0\ \gamma_1\ \cdots\ \gamma_{m+n-1}) = (R(f, g)\ \ 0\ \cdots\ 0)$, and
the equality (**) shows that $a(X) f(X) + b(X) g(X) = R(f, g)$. If both $f$ and
$g$ are nonconstant, then neither $a(X)$ nor $b(X)$ can be 0, since otherwise the
equation would show that $R(f, g)$ is a nonconstant polynomial.
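
The last step of the proof is constructive, and a small Python sketch can make it concrete for integer coefficients. It relies on one observation that is an assumption of this sketch rather than a statement in the text: the row vector $(R(f,g)\ 0\ \cdots\ 0)$ times the inverse matrix is the first row of the classical adjoint, whose entries are signed minors obtained by deleting one row and the first column. The names `sylvester_matrix`, `det`, `bezout_coefficients`, `poly_mul`, and `poly_add` are hypothetical.

```python
def sylvester_matrix(f, g):
    """Matrix defining the resultant; f = [f_0, ..., f_m], g = [g_0, ..., g_n]."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = [[0] * i + list(f) + [0] * (size - m - 1 - i) for i in range(n)]
    rows += [[0] * i + list(g) + [0] * (size - n - 1 - i) for i in range(m)]
    return rows

def det(M):
    """Exact determinant by cofactor expansion along the first row."""
    if len(M) == 0:
        return 1
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * e * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j, e in enumerate(M[0]) if e != 0)

def bezout_coefficients(f, g):
    """Return (a, b, R) with a*f + b*g = R, deg a < n, deg b < m."""
    m, n = len(f) - 1, len(g) - 1
    M = sylvester_matrix(f, g)
    # Entry j of the first row of the adjoint: delete row j and column 0.
    row = [(-1) ** j * det([r[1:] for i, r in enumerate(M) if i != j])
           for j in range(m + n)]
    return row[:n], row[n:], det(M)

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

f = [2, -3, 1]   # X^2 - 3X + 2
g = [-3, 1]      # X - 3
a, b, R = bezout_coefficients(f, g)
combination = poly_add(poly_mul(a, f), poly_mul(b, g))  # coefficients of a*f + b*g
```

For this sample pair the sketch produces $a = 1$ and $b = -X$, and indeed $(X^2 - 3X + 2) - X(X - 3) = 2 = R(f, g)$.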


Theorem 8.2 (Bezout’s Theorem). Let $K$ be any field, and let $f(X, Y)$ and
$g(X, Y)$ be nonconstant polynomials in $K[X, Y]$, of exact respective degrees $m$
and $n$. If the locus of common zeros of $f$ and $g$ in $K^2$ has more than $mn$ points,
then $f$ and $g$ have a nonconstant common factor in $K[X, Y]$.


PROOF. For most of the proof, we assume that $K$ is infinite. Arguing by
contradiction, suppose that $f$ and $g$ both vanish at distinct points $(x_i, y_i)$ for
$1 \le i \le mn + 1$, and suppose that $f$ and $g$ have no nonconstant common factor.
Since there are only finitely many members $c$ of $K$ such that $y_i - y_j = c(x_i - x_j)$
for some $i$ and $j$ with $i \ne j$ and since $K$ is assumed to be infinite, we can find
$c$ in $K$ such that $y_i - y_j \ne c(x_i - x_j)$ for all $i$ and $j$ with $i \ne j$. For this $c$,
$y_i - cx_i \ne y_j - cx_j$ when $i \ne j$, and therefore the second coordinates of the
points $(x_i, y_i - cx_i)$ are distinct. The common zeros of $f(X, Y)$ and $g(X, Y)$
include the points $(x_i, y_i)$, and thus the common zeros of $f(X, Y + cX)$ and
$g(X, Y + cX)$ include the $mn + 1$ points $(x_i, y_i - cx_i)$ whose second coordinates
are distinct.
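
The choice of $c$ in this step can be illustrated by a short Python sketch. It is only an illustration under assumptions: the integers stand in for an infinite field, the candidates $c = 0, 1, 2, \dots$ are tried in order, and the function name `find_shear` and the search bound are hypothetical.

```python
def find_shear(points, bound=1000):
    """Return the first c in 0, 1, ..., bound - 1 such that the values y - c*x
    are pairwise distinct over the given distinct points (x, y)."""
    for c in range(bound):
        second = [y - c * x for x, y in points]
        if len(set(second)) == len(second):
            return c
    raise ValueError("no suitable c found below the search bound")

# Two of these points share a second coordinate, and two share a first coordinate.
points = [(0, 0), (1, 0), (1, 1)]
c = find_shear(points)
sheared = [(x, y - c * x) for x, y in points]  # second coordinates now distinct
```

Only finitely many values of $c$ fail, one for each pair of points with distinct first coordinates, so the search terminates quickly whenever the points are distinct.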

In other words, there is no loss of generality in assuming that the given
polynomials $f$ and $g$ vanish at $mn + 1$ points whose second coordinates are
distinct. Regard $f(X, Y)$ and $g(X, Y)$ as members $f(X)$ and $g(X)$ of $A[X]$,
where $A = K[Y]$, and write

$$f(X) = f_0 + f_1 X + \cdots + f_{m'} X^{m'},$$
$$g(X) = g_0 + g_1 X + \cdots + g_{n'} X^{n'},$$

with each $f_i$ and $g_j$ in $A$ and with $f_{m'} \ne 0$ and $g_{n'} \ne 0$. Here $m' \le m$ and $n' \le n$.

Let us rule out the possibility that $m' = 0$ or $n' = 0$. Indeed, if we had $m' = 0$,
then the polynomial $f$ would be nonzero and would depend on $Y$ alone. Since
$f$ is nonzero and has degree $m \ge 1$, it has at most $m$ roots. But we are assuming
that $f$ and $g$ vanish at $mn + 1$ points whose $Y$ coordinates are distinct, and the
inequalities $m \le mn < mn + 1$ therefore give a contradiction. Thus $m' \ne 0$.
Similarly $n' \ne 0$. So Theorem 8.1 is applicable.

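Form the square matrix $\mathcal{R}(f, g)$ of size $m' + n'$ and its determinant $R(f, g)$.
The latter is a member of $K[Y]$, and Theorem 8.1 shows that it cannot be 0, since
$f$ and $g$ are assumed to have no nonconstant common factor in $K[X, Y]$.

Let us bound the degree of the member $R(f, g) = \det \mathcal{R}(f, g)$ of $K[Y]$. Each
term in the expansion of the determinant is of the form

$$\pm \prod_{1 \le i \le m'+n'} \mathcal{R}(f, g)_{i,\sigma(i)} \qquad (*)$$

for some permutation $\sigma$ of $\{1, \dots, m' + n'\}$. Here $\mathcal{R}(f, g)_{ij}$ is given by

$$\mathcal{R}(f, g)_{ij} = \begin{cases}
f_{j-i} & \text{for } 1 \le i \le n' \text{ and for } j \text{ with } i \le j \le m' + i, \\
0 & \text{for } 1 \le i \le n' \text{ and for all other } j, \\
g_{j+n'-i} & \text{for } n' + 1 \le i \le n' + m' \text{ and for } j \text{ with } i \le n' + j \le n' + i, \\
0 & \text{for } n' + 1 \le i \le n' + m' \text{ and for all other } j.
\end{cases}$$

In addition, the degree of $f_{j-i}$ as a member of $K[Y]$ is at most $m - (j - i)$, and
the degree of $g_{j+n'-i}$ is at most $n - (j + n' - i) = (n - n') + (i - j)$. Setting
$j = \sigma(i)$, we see that the degree of (*) is at most

$$\sum_{1 \le i \le n'} \big(m - \sigma(i) + i\big) + \sum_{n'+1 \le i \le m'+n'} \big((n - n') + (i - \sigma(i))\big)
= mn' + m'(n - n') = mn - (m - m')(n - n') \le mn,$$

the first equality holding because $\sum_i (i - \sigma(i)) = 0$ when the sum extends over all $i$.
Thus $R(f, g)$ is a nonzero polynomial in $K[Y]$ of degree at most $mn$.
Consequently it has at most $mn$ roots.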

Theorem 8.1 shows that af + bg = R(f, g) for suitable members a and b of 
K[X, Y]. Recalling that f and g are assumed to vanish at mn + 1 points whose 
second coordinates are distinct, we see that R(f, g) vanishes at each of these 
second coordinates, and we arrive at a contradiction. 




Now we can allow $K$ to be finite. Let $K'$ be an infinite extension field of $K$. We have
just seen that $f$ and $g$ have a nonconstant common factor in $K'[X, Y]$. Without loss
of generality, this factor depends nontrivially on $X$. Theorem 8.1 applied with
$A = K'[Y]$ shows that $R(f, g) = 0$. The same theorem with $A = K[Y]$
then shows that $f$ and $g$ have a common factor in $A[X] = K[X, Y]$ depending
nontrivially on $X$.


Let us introduce some geometric language for the situation in Theorem 8.2.
Affine n-space over a field $K$ is the set of $n$-dimensional column vectors

$$\mathbb{A}^n = \mathbb{A}^n_{K_{\mathrm{alg}}} = \{(x_1, \dots, x_n) \in K_{\mathrm{alg}}^n\}$$

with entries in a fixed algebraic closure $K_{\mathrm{alg}}$ of $K$. The set of K rational points,
or K points, in $\mathbb{A}^n$ is the subset

$$\mathbb{A}^n_K = \{(x_1, \dots, x_n) \in K^n\}.$$

We shall comment on the appearance of $K_{\mathrm{alg}}$ in these definitions shortly.

Members of $\mathbb{A}^n$ are called points in $n$-dimensional affine space, and the
functions $P \mapsto x_j(P)$ give the coordinates of the points. If $L$ is any field between
$K$ and $K_{\mathrm{alg}}$, then any polynomial $f$ in $K[X_1, \dots, X_n]$ defines a corresponding
polynomial function from $\mathbb{A}^n_L$ into $L$.

For algebraic geometry the case of interest for Sections 1-6 of this chapter is
the case $n = 2$. The way of viewing a curve is influenced by Cramer’s thinking as
discussed in Section 1: the particular polynomial that defines a curve is important,
not just the zero locus in the affine plane, but two curves are to be regarded as the
same if each is a nonzero multiple of the other. We can incorporate this viewpoint
into algebraic language by defining an affine plane curve $C$ over the field $K$ to
be any nonzero proper principal ideal⁵ in $K[X, Y]$. The curve is an affine plane
line if the degree of any generator is 1.

In practice in studying affine plane curves, there is ordinarily no need to
distinguish between a polynomial and the principal ideal that it generates, and
we shall feel free to refer to an affine plane curve $C = (f)$ as $f$ when there is no
possibility of confusion.

The zero locus of a curve is the corresponding geometric notion, but it can
readily be empty, as is the case with $X^2 + Y^2 + 1$ when $K = \mathbb{R}$. On the
other hand, the Nullstellensatz (Theorem 7.1) ensures that the zero locus will be
nonempty if the underlying field is algebraically closed. Thus we define the zero
locus $V(C) = V((f))$ of the curve $C = (f(X, Y))$ by⁶

$$V(C) = V_{K_{\mathrm{alg}}}(C) = \{(x, y) \in K_{\mathrm{alg}}^2 \mid f(x, y) = 0\}.$$

This is the same as the set of all $(x, y)$ such that every member of the ideal $C$
vanishes at $(x, y)$. The set of K rational points, or K points, of $C$ is

$$V_K(C) = V_K((f)) = \{(x, y) \in K^2 \mid f(x, y) = 0\}.$$

When we are content to refer to an affine curve $C = (f)$ as $f$, we are content
also to write $V(f)$ in place of $V(C) = V((f))$.

⁵Warning: This definition will be changed slightly in Chapter IX and again in Chapter X to
reflect changed emphasis in those chapters.

⁶The letter “V” is the letter that is commonly used in the notation for a zero locus. It stands for
“variety,” a notion that we have not yet defined. But beware: not all objects labeled with a “V” are
actually varieties the way the term is normally defined. An affine plane curve will turn out to be a
variety exactly when the generating polynomial $f$ is prime in $K_{\mathrm{alg}}[X, Y]$.

In Chapter X, under the assumption that $K$ is algebraically closed, we shall
extend these definitions from the case $n = 2$ and $C$ as above to the case that
$n$ is general and $C$ is replaced by any ideal $I$ in $K[X_1, \dots, X_n]$. The set $V(I)$
of common zeros of the members of $I$ in $K^n = K_{\mathrm{alg}}^n$ will be called an “affine
algebraic set.” The case of affine $n$-space itself arises when the ideal is 0.

For general $K$, not necessarily algebraically closed, it is meaningful to consider
the set $V_K(I)$ of $K$ rational points, i.e., the subset of common zeros lying in $K^n$.
For $I = 0$ and $V(I) = \mathbb{A}^n$, the distinction between $V_K(I)$ and $V_{K_{\mathrm{alg}}}(I)$ is hardly
worth mentioning, but the distinction is well worth making for general $I$ and is
made for the case $V(I) = \mathbb{A}^n$ for consistency. Although the study of sets $V_K(I)$
is of importance in number theory, in geometry over $\mathbb{R}$, and in other areas, we
shall not pursue it in Chapter X for lack of space.

Returning to Theorem 8.2, we see that the statement concerns $V_K(C) \cap V_K(D)$,
where $C$ and $D$ are the principal ideals $C = (f)$ and $D = (g)$ in $K[X, Y]$. The
theorem says that if $V_K(C) \cap V_K(D)$ contains more than $mn$ points, then there is
a nonzero proper principal ideal $(h)$ with $(h) \supseteq (f) \cup (g)$.


3. Projective Plane Curves 


Section 2 dealt with intersections of affine plane curves. Even over an alge- 
braically closed field, two affine plane curves need not intersect. An example is 
the pair of straight lines X + Y — 1 and X + Y — 2, whose locus of common zeros 
is empty. To get these lines to intersect, we have to introduce “points at infinity.” 
The projective plane is the device for including such points. 

Let $K$ be a field, and let $K_{\mathrm{alg}}$ be an algebraic closure. The projective plane
over $K$ is defined set theoretically as the quotient of $K_{\mathrm{alg}}^3 - \{0\}$ by an equivalence
relation:

$$\mathbb{P}^2 = \mathbb{P}^2_{K_{\mathrm{alg}}} = \{(x, y, w) \in K_{\mathrm{alg}}^3 - \{0\}\} / \sim,$$

where $(x', y', w') \sim (x, y, w)$ if $(x', y', w') = \lambda(x, y, w)$ for some $\lambda \in K_{\mathrm{alg}}^{\times}$.
The set of K rational points, or K points, of $\mathbb{P}^2$ is the quotient

$$\mathbb{P}^2_K = \{(x, y, w) \in K^3 - \{0\}\} / \sim,$$

where $(x', y', w') \sim (x, y, w)$ if $(x', y', w') = \lambda(x, y, w)$ for some $\lambda \in K^{\times}$.
When there is a need to be careful, we shall write $[x, y, w]$ for the member of $\mathbb{P}^2_K$
corresponding to $(x, y, w)$ in $K^3 - \{0\}$. But often there will not be such a need,
and we shall simply refer to $(x, y, w)$ as a member of $\mathbb{P}^2_K$. Both $\mathbb{P}^2$ and $\mathbb{P}^2_K$ have
additional structure on them, given by “affine local coordinates,” and we come to
that matter later in this section.
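
To see informally how the two parallel lines mentioned at the beginning of this section acquire a common point, here is a short worked computation. It is a sketch that assumes the usual homogenization with respect to the third coordinate $W$, a device that has not yet been introduced formally at this point of the text.

```latex
% Assumption: each affine line is homogenized by inserting the third variable W,
% so that the affine plane corresponds to the points with w = 1 (up to scaling).
\begin{align*}
X + Y - 1 = 0 \quad &\longrightarrow \quad X + Y - W = 0,\\
X + Y - 2 = 0 \quad &\longrightarrow \quad X + Y - 2W = 0.
\end{align*}
% Subtracting the two homogeneous equations gives W = 0, and then X + Y = 0.
\[
W = 0, \qquad X + Y = 0, \qquad\text{so the only common zero is}\qquad [x, y, w] = [1, -1, 0].
\]
```

The common zero has $w = 0$ and therefore does not correspond to any point of the affine plane; it is the kind of “point at infinity” that the projective plane is designed to include.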

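Let us record briefly the obvious generalization of the projective plane to other
dimensions: Projective n-space over $K$ is defined set theoretically as the quotient

$$\mathbb{P}^n = \mathbb{P}^n_{K_{\mathrm{alg}}} = \{(x_1, \dots, x_{n+1}) \in K_{\mathrm{alg}}^{n+1} - \{0\}\} / \sim,$$

where $(x_1', \dots, x_{n+1}') \sim (x_1, \dots, x_{n+1})$ if $(x_1', \dots, x_{n+1}') = \lambda(x_1, \dots, x_{n+1})$ for
some $\lambda \in K_{\mathrm{alg}}^{\times}$. The set $\mathbb{P}^n_K$ of $K$ rational points of $\mathbb{P}^n$ is the set defined in similar
fashion using just nonzero vectors in $K^{n+1}$ and scalars in $K^{\times}$.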

Scalar-valued functions on $\mathbb{P}^n_K$ are of little interest because they amount to
scalar-valued functions of $K^{n+1} - \{0\}$ that are unchanged when $(x_1, \dots, x_{n+1})$ is
replaced by a multiple of itself. A polynomial of this kind, for example, is
necessarily constant. Instead, the polynomials of interest that are related to $\mathbb{P}^n_K$ are
“homogeneous polynomials.” A monomial in $K[X_1, \dots, X_{n+1}]$ is a polynomial
of the form $X_1^{j_1} \cdots X_{n+1}^{j_{n+1}}$; its total degree is $\sum_{i=1}^{n+1} j_i$. We say that a nonzero
$F$ in $K[X_1, \dots, X_{n+1}]$ is homogeneous of degree $d \ge 0$ if every monomial
appearing in $F$ with nonzero coefficient has total degree $d$. By convention the 0
polynomial is homogeneous of every degree. We write $K[X_1, \dots, X_{n+1}]_d$ for
the set of homogeneous polynomials of degree $d$. Each such $F$ satisfies

$$F(\lambda x_1, \dots, \lambda x_{n+1}) = \lambda^d F(x_1, \dots, x_{n+1})$$

for all $(x_1, \dots, x_{n+1}) \in K^{n+1}$ and $\lambda \in K^{\times}$. Conversely the fact that the mapping
of polynomials into polynomial functions is one-one for an infinite field implies
that homogeneous polynomials over an infinite field can be detected by this
property.

Let us assemble some further properties of homogeneous polynomials: The 
monomials of total degree d form a K basis of the vector space $K[X_1, \dots, X_{n+1}]_d$;
this fact follows from the definition of polynomials over K. To calculate the
dimension of $K[X_1, \dots, X_{n+1}]_d$, consider the problem of taking d factors X on
which to place subscripts and using n dividers to separate the $X_1$’s from the $X_2$’s
and so on. The number of monomials in question is just the number of ways of
selecting the n dividers from among the d + n symbols and dividers. Thus we
obtain the important formula

$$\dim_K K[X_1, \dots, X_{n+1}]_d = \binom{d+n}{n}.$$
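The divider count can be checked by brute force for small cases. The following sketch lists exponent vectors of total degree d in n + 1 variables and compares their number with the binomial coefficient; the function name `monomial_exponents` is an illustrative choice, not notation from the text.

```python
from itertools import product
from math import comb

def monomial_exponents(num_vars, d):
    """All exponent vectors (j_1, ..., j_num_vars) with j_1 + ... + j_num_vars = d."""
    return [e for e in product(range(d + 1), repeat=num_vars) if sum(e) == d]

# n + 1 variables, total degree d: the count should equal C(d + n, n).
for n in range(0, 4):
    for d in range(0, 6):
        assert len(monomial_exponents(n + 1, d)) == comb(d + n, n)

print(len(monomial_exponents(3, 2)))  # 6 monomials of degree 2 in X, Y, W
```

For the projective plane (n = 2) this says that the space of homogeneous polynomials of degree d in X, Y, W has dimension (d + 2)(d + 1)/2.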


Lemma 8.3. Any polynomial factor of a homogeneous polynomial over a field 
K is homogeneous. 




PROOF. Write $F = F_1F_2$ nontrivially. Let $d_1$ and $e_1$ be the highest and lowest
total degrees of terms in $F_1$, and let $d_2$ and $e_2$ be the highest and lowest total
degrees of terms in $F_2$. The product of the terms of total degree $d_1$ in $F_1$ and
the terms of total degree $d_2$ in $F_2$ is nonzero and is the part of F of total
degree $d_1 + d_2$. The product of the terms of total degree $e_1$ in $F_1$ and the terms
of total degree $e_2$ in $F_2$ is nonzero and is the part of F of total degree $e_1 + e_2$.
Since F is homogeneous, $d_1 + d_2 = e_1 + e_2$. Because $d_1 \ge e_1$ and $d_2 \ge e_2$, it
follows that $d_1 = e_1$ and $d_2 = e_2$; thus $F_1$ and $F_2$ are homogeneous.


An ideal I in $K[X_1, \dots, X_{n+1}]$ is called a homogeneous ideal if it is the sum
over $d \ge 0$ of its intersections with $K[X_1, \dots, X_{n+1}]_d$:

$$I = \bigoplus_{d=0}^{\infty} \bigl(I \cap K[X_1, \dots, X_{n+1}]_d\bigr).$$

The sum is to be regarded as a direct sum of vector spaces. For such an ideal, we
can compute the quotient $K[X_1, \dots, X_{n+1}]/I$ term by term:

$$K[X_1, \dots, X_{n+1}]/I = \bigoplus_{d=0}^{\infty} K[X_1, \dots, X_{n+1}]_d \big/ \bigl(I \cap K[X_1, \dots, X_{n+1}]_d\bigr).$$


We can often recognize a homogeneous ideal from its generators: an ideal with a
set of generators that are all homogeneous is necessarily a homogeneous ideal. In
fact, if an ideal I has homogeneous generators $F_j$, then the most general member
of I is a finite sum of terms $A_jF_j$. The terms of total degree d in $A_jF_j$ are the
product of $F_j$ with the terms in $A_j$ of total degree $d - \deg F_j$, and each such term
is in I. Hence each member of I is a sum of homogeneous polynomials that lie
in I, and the assertion follows.

In the setting of $\mathbb{P}^2$, projective plane curves over K are initially defined to be
nonconstant homogeneous polynomials in K[X, Y, W]. Although such polynomials
are not well defined on the projective plane, their zero loci are well defined
subsets of $\mathbb{P}^2$. As in the affine case, the particular polynomial that defines a curve
is important, not just the zero locus, but two curves are to be regarded as the same 
if each is a nonzero multiple of the other. We can incorporate this viewpoint into 
algebraic language by defining a projective plane curve of degree d > 0 over 
the field K to be any nonzero proper principal ideal in K[X, Y, W] generated by a 
homogeneous polynomial of degree d. Such an ideal is necessarily homogeneous. 
In the special cases that d = 1, 2, 3, or 4, the curve is called a projective line, 
conic, cubic, or quartic respectively. 

Just as in the affine case, in practice in studying projective plane curves, 
there is often no need to distinguish between a homogeneous polynomial and 
the homogeneous principal ideal that it generates, and we shall feel free to refer 
to a projective plane curve $C = (F) \subseteq K[X, Y, W]$ as F when there is no
possibility of confusion. 


If (F) is a projective plane curve of degree d, then its zero locus is denoted by

$$V((F)) = V_{K_{\mathrm{alg}}}((F)) = \{[x, y, w] \in \mathbb{P}^2 \mid F(x, y, w) = 0\}.$$

The locus

$$V_K((F)) = \{[x, y, w] \in \mathbb{P}^2_K \mid F(x, y, w) = 0\}$$

is called the set of K rational points, or K points, of the curve. When we allow
ourselves to refer to the curve simply as F, then we can write V(F) in place of
V((F)).


The affine plane $\mathbb{A}^2_K = \{(x, y)\}$ has a standard one-one embedding into the
projective plane $\mathbb{P}^2_K$. Namely we map (x, y) into [x, y, 1]. The set that is missed
by the image is the set with w = 0, which is the set of K rational points of the line
L with L(X, Y, W) = W, a line called the line at infinity. We shall denote this
line by W. The points of $V_K(W)$, i.e., those with w = 0, are called the points at
infinity.

Except for the line at infinity, lines in $\mathbb{P}^2_K$ correspond under restriction exactly
to lines in $K^2$. Namely the projective line L(X, Y, W) = aX + bY + cW
corresponds to the affine line l(X, Y) = aX + bY + c, and vice versa. In certain
ways the geometry of $\mathbb{P}^2_K$ is simpler than the geometry of $\mathbb{A}^2_K$:


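(i) Two distinct lines in $\mathbb{P}^2_K$ intersect in a unique point. In fact, if the
lines are aX + bY + cW and a′X + b′Y + c′W, we set up the
system of equations

$$\begin{pmatrix} a & b & c \\ a' & b' & c' \end{pmatrix} \begin{pmatrix} x \\ y \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

Since the lines are distinct, the coefficient matrix has rank 2. Thus
the kernel has dimension 1, and there is just one point [x, y, w] in the
intersection.

(ii) Two distinct points in $\mathbb{P}^2_K$ lie on a unique line. In fact, if the points
are [x, y, w] and [x′, y′, w′], we set up the system of equations

$$\begin{pmatrix} x & y & w \\ x' & y' & w' \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

for the coefficients (a, b, c) of the line and argue in similar fashion.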
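Both rank-2 systems can be solved in one step with a cross product, since the cross product of two independent vectors in K^3 spans the kernel of the 2-by-3 matrix having those vectors as rows. The sketch below uses integer coordinates and the hypothetical helper names `meet` and `join`; it is one convenient way to compute the answers, not a construction taken from the text.

```python
def cross(u, v):
    """Cross product of two vectors in K^3 (here with integer entries)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def meet(line1, line2):
    """Homogeneous coordinates of the intersection point of two distinct lines."""
    point = cross(line1, line2)
    if point == (0, 0, 0):
        raise ValueError("the two lines coincide")
    return point

def join(point1, point2):
    """Coefficients (a, b, c) of the line aX + bY + cW through two distinct points."""
    line = cross(point1, point2)
    if line == (0, 0, 0):
        raise ValueError("the two points coincide")
    return line

# The parallel affine lines X + Y - 1 and X + Y - 2 become X + Y - W and X + Y - 2W.
print(meet((1, 1, -1), (1, 1, -2)))   # (-1, 1, 0), a point at infinity
print(join((0, 0, 1), (1, 1, 1)))     # (-1, 1, 0), i.e., the line -X + Y = 0
```

The first call shows the two parallel affine lines from the start of the section meeting at the point [−1, 1, 0] on the line at infinity.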


Along with the embedding of $\mathbb{A}^2_K$ into $\mathbb{P}^2_K$ is a correspondence between projective
curves and affine curves. Let us work with the polynomials themselves,
without identifying each polynomial with every nonzero scalar multiple of itself.
The passage from a nonzero homogeneous polynomial F(X, Y, W) of degree




d > 0 to a polynomial f(X, Y) is given by f(X, Y) = F(X, Y, 1). The mapping
$F \mapsto f$ is a substitution homomorphism, and it therefore respects products.
However, the degree may drop in the process, and in particular f(X, Y) is a
constant if and only if F(X, Y, W) is a multiple of $W^d$.

In the reverse direction if f(X, Y) is a polynomial of degree e, then f(X, Y)
arises from a polynomial F(X, Y, W), but we have to specify the degree d of F
and we must have $d \ge e$. Operationally we obtain F by inserting a power of W
into each term of f to make the total degree of the term become d. For example,
with $f(X, Y) = Y^2 + XY + X^3$ if the desired degree is 3, then $F(X, Y, W) =
Y^2W + XYW + X^3$. On the other hand, if the desired degree is 4, then $F(X, Y, W)
= Y^2W^2 + XYW^2 + X^3W$.

The formula for this reverse process is $F(X, Y, W) = W^d f(XW^{-1}, YW^{-1})$.
That is, F is given by a substitution homomorphism, followed by multiplication 
by a power of W. From this fact, we can read off conclusions of the following 
kind: 

If polynomials f(X, Y) and g(X, Y) are obtained from homogeneous
polynomials F(X, Y, W) and G(X, Y, W) by taking W = 1,
then there exist integers r and s such that the polynomial
$W^r F(X, Y, W) + W^s G(X, Y, W)$ is homogeneous and such that
f(X, Y) + g(X, Y) is obtained from it by taking W = 1.
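The two passages between f and F are easy to mechanize. In the sketch below a polynomial is stored as a dictionary from exponent tuples to coefficients; this representation and the names `homogenize` and `dehomogenize` are assumptions made for illustration. The example reproduces the two homogenizations of Y^2 + XY + X^3 given above.

```python
def homogenize(f, d):
    """Insert powers of W so that every term of f has total degree d.

    f maps (i, j) to the coefficient of X^i Y^j and is assumed nonzero;
    the result maps (i, j, k) to the coefficient of X^i Y^j W^k.
    """
    degree = max(i + j for (i, j), c in f.items() if c != 0)
    if d < degree:
        raise ValueError("the target degree d must be at least deg f")
    return {(i, j, d - i - j): c for (i, j), c in f.items() if c != 0}

def dehomogenize(F):
    """Set W = 1 in F, collecting terms that become equal."""
    f = {}
    for (i, j, k), c in F.items():
        f[(i, j)] = f.get((i, j), 0) + c
    return {mono: c for mono, c in f.items() if c != 0}

# f(X, Y) = Y^2 + XY + X^3
f = {(0, 2): 1, (1, 1): 1, (3, 0): 1}
print(homogenize(f, 3))  # Y^2 W + XYW + X^3
print(homogenize(f, 4))  # Y^2 W^2 + XYW^2 + X^3 W
print(dehomogenize(homogenize(f, 4)) == f)
```

Taking W = 1 after homogenizing always returns the original f, whatever admissible degree d was chosen, which is why d has to be specified separately.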


As we mentioned above, $\mathbb{P}^2_K$ has more structure than simply the structure of
a set. About any point in $\mathbb{P}^2_K$ we can introduce various systems of “affine local
coordinates.” The idea is to imitate what happens in the definition of a manifold:
the whole manifold is covered by charts, each giving an invertible mapping of a
set in the manifold to an open subset of Euclidean space. Here a single system
of affine local coordinates plays the role of a chart; it puts $\mathbb{A}^2_K$ into one-one
correspondence with the complement of the zero locus of a line in $\mathbb{P}^2_K$.

Let Φ be a member of the matrix group GL(3, K). Then Φ maps the set $K^3$
of column vectors in one-one fashion onto $K^3$ and passes to a one-one map of
$\mathbb{P}^2_K$ onto $\mathbb{P}^2_K$ called the projective transformation corresponding to Φ. Two Φ’s
give the same map of $\mathbb{P}^2_K$ if and only if they are multiples of one another. The
group action of GL(3, K) on $\mathbb{P}^2_K$ is transitive because GL(3, K) acts transitively
on $K^3 - \{(0, 0, 0)\}$.

If L is the projective line whose coefficients are given by the row vector
(a b c) and if Φ is in GL(3, K), then the row vector $(a\ b\ c)\Phi^{-1}$
defines a new projective line $L^{\Phi}$, and the K rational points of $L^{\Phi}$ are given by

$$V_K(L^{\Phi}) = \Phi(V_K(L)).$$


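In fact, let $\begin{pmatrix} x \\ y \\ w \end{pmatrix}$ be in $V_K(L)$. Then $\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} = \Phi \begin{pmatrix} x \\ y \\ w \end{pmatrix}$ is in $\Phi(V_K(L))$ and
satisfies

$$(a\ b\ c)\,\Phi^{-1} \begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} = 0;$$

hence it is in $V_K(L^{\Phi})$. Conversely if $\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}$ is in $V_K(L^{\Phi})$, then $\begin{pmatrix} x \\ y \\ w \end{pmatrix} = \Phi^{-1} \begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}$
satisfies

$$(a\ b\ c) \begin{pmatrix} x \\ y \\ w \end{pmatrix} = (a\ b\ c)\,\Phi^{-1} \begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} = 0,$$

and thus $\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}$ is Φ of something in $V_K(L)$.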


To form the analog of a chart, fix $[x_0, y_0, w_0]$ in $\mathbb{P}^2_K$. Choose (by transitivity)
some Φ in GL(3, K) with $\Phi(x_0, y_0, w_0) = (0, 0, 1)$. Then we can define affine
local coordinates on $\Phi^{-1}(K \times K \times \{1\})$ to $K^2$ by the one-one map

$$\varphi(\Phi^{-1}(x, y, 1)) = (x, y).$$

This definition generalizes the standard embedding of the affine plane $K^2$ into
$\mathbb{P}^2_K$ earlier; that embedding was the case Φ = 1.


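EXAMPLES OF AFFINE LOCAL COORDINATES FOR $\mathbb{P}^2_K$.

(1) Suppose $(x_0, y_0, w_0) = (x_0, y_0, 1)$. We can choose $\Phi = \begin{pmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{pmatrix}$. Then

$$\Phi \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{pmatrix} x - x_0 \\ y - y_0 \\ 1 \end{pmatrix}.$$

In this case, the local coordinates are defined on

$$\Phi^{-1}(K \times K \times \{1\}) = K \times K \times \{1\}$$

and are given by

$$\varphi(x, y, 1) = \varphi(\Phi^{-1}(\Phi(x, y, 1))) = \varphi(\Phi^{-1}(x - x_0, y - y_0, 1)) = (x - x_0, y - y_0).$$

This Φ is handy for reducing behavior about $(x_0, y_0, 1)$ in $\mathbb{P}^2_K$ to behavior about
(0, 0) in $K^2$.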




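(2) Suppose $(x_0, y_0, w_0) = (0, 1, 0)$. We can choose $\Phi = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$. Then

$$\Phi \begin{pmatrix} x \\ 1 \\ w \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ 1 \\ w \end{pmatrix} = \begin{pmatrix} w \\ x \\ 1 \end{pmatrix}$$

and

$$\varphi(x, 1, w) = \varphi(\Phi^{-1}(\Phi(x, 1, w))) = \varphi(\Phi^{-1}(w, x, 1)) = (w, x).$$

This Φ is handy for studying behavior near one of the points at infinity in $\mathbb{P}^2_K$.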


We can use affine local coordinates to examine the behavior of a projective 
plane curve “near a particular point,” by which is meant “with that point as the 
center point in the analysis.” To examine behavior near (0, 0, 1), we use the 
correspondence f(X, Y) = F(X, Y, 1) that we discussed earlier. For a general 
point, we make use of the fact that whenever F is a homogeneous polynomial of 
degree d, then so is $F \circ \Phi^{-1}$. To examine the behavior of F near a point $(x_0, y_0, w_0)$
in $K^3 - \{(0, 0, 0)\}$, we choose Φ in GL(3, K) with $\Phi(x_0, y_0, w_0) = (0, 0, 1)$,
and we define

$$f(X, Y) = F(\Phi^{-1}(X, Y, 1)).$$

Under this correspondence the behavior of F at $(x_0, y_0, w_0)$ is reflected in the
behavior of f at (0, 0). We call f(X, Y) the local expression for F in the affine
local coordinates determined by Φ. This local expression is a polynomial in
K[X, Y], and it is nonconstant unless F is a scalar multiple of $(W \circ \Phi)^d$ for
some d.


EXAMPLES, CONTINUED. 


(1) Suppose that $(x_0, y_0, w_0) = (x_0, y_0, 1)$ and that $\Phi = \begin{pmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{pmatrix}$. Computation
gives

$$\Phi^{-1} \begin{pmatrix} X \\ Y \\ 1 \end{pmatrix} = \begin{pmatrix} X + x_0 \\ Y + y_0 \\ 1 \end{pmatrix},$$

and the corresponding local expression for a projective plane curve F is

$$f(X, Y) = F(X + x_0, Y + y_0, 1).$$

For the projective plane curve

$$F(X, Y, W) = X^2Y + XYW + W^3$$

and the same Φ, the local expression f(X, Y) splits into homogeneous terms as




$$f(X, Y) = (x_0^2y_0 + x_0y_0 + 1) + (x_0^2Y + 2x_0y_0X + x_0Y + y_0X) + (y_0X^2 + 2x_0XY + XY) + (X^2Y).$$

We shall use this splitting in the next section in the first example of intersection
multiplicity.


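(2) Suppose that $(x_0, y_0, w_0) = (0, 1, 0)$ and that $\Phi = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$. Then

$$\Phi^{-1} \begin{pmatrix} X \\ Y \\ 1 \end{pmatrix} = \begin{pmatrix} Y \\ 1 \\ X \end{pmatrix},$$

and the local expression for a projective plane curve F relative to this Φ is

$$f(X, Y) = F(Y, 1, X).$$

For the same projective plane curve F as in Example 1, namely

$$F(X, Y, W) = X^2Y + XYW + W^3,$$

we obtain

$$f(X, Y) = (Y^2 + XY) + (X^3).$$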


We shall examine this example further in the next section. 
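The two local expressions for F(X, Y, W) = X^2 Y + XYW + W^3 can be double-checked numerically. The sketch below compares each substitution with the claimed splitting on a grid of integer sample values; since every variable occurs to degree at most 3 and the grid has 7 values per variable, agreement on the grid forces the polynomial identities over the rationals. The function names are illustrative only.

```python
from itertools import product

def F(x, y, w):
    """The projective plane curve F(X, Y, W) = X^2 Y + XYW + W^3."""
    return x**2 * y + x * y * w + w**3

def split_example1(X, Y, x0, y0):
    """Claimed splitting of F(X + x0, Y + y0, 1) into homogeneous parts in (X, Y)."""
    f0 = x0**2 * y0 + x0 * y0 + 1
    f1 = x0**2 * Y + 2 * x0 * y0 * X + x0 * Y + y0 * X
    f2 = y0 * X**2 + 2 * x0 * X * Y + X * Y
    f3 = X**2 * Y
    return f0 + f1 + f2 + f3

samples = range(-3, 4)  # 7 sample values per variable

# Example 1: local expression about (x0, y0, 1).
ok1 = all(F(X + x0, Y + y0, 1) == split_example1(X, Y, x0, y0)
          for X, Y, x0, y0 in product(samples, repeat=4))

# Example 2: local expression about (0, 1, 0).
ok2 = all(F(Y, 1, X) == Y**2 + X * Y + X**3
          for X, Y in product(samples, repeat=2))

print(ok1, ok2)
```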


In this way we have associated to each projective plane curve F and to the
system of affine local coordinates determined by a member Φ of GL(3, K) a local
expression that is a nonzero polynomial in K[X, Y]. Conversely if the degree
d and the member Φ of GL(3, K) are given and if f in K[X, Y] is nonzero of
degree at most d, then we can reconstruct a projective plane curve F of degree
d whose local expression relative to Φ is f. We have only to form the unique
homogeneous polynomial G of degree d with f(X, Y) = G(X, Y, 1) and then
put $F = G \circ \Phi$.


With these preparations in place, we return to a consideration of resultants and 
Bezout’s Theorem. Our objective is to rephrase Theorem 8.2 to take advantage 
of properties of the projective plane. 


Lemma 8.4. Let K be a field, let A be the polynomial ring $A = K[x_1, \dots, x_r]$,
and let f and g be members of A[X] of the form

$$f(X) = f_0 + f_1X + \cdots + f_mX^m,$$
$$g(X) = g_0 + g_1X + \cdots + g_nX^n,$$

where $f_j$ is a member of A homogeneous of degree $m' - j$ and $g_j$ is a member of
A homogeneous of degree $n' - j$. Then the resultant R(f, g) is a homogeneous
member of A of degree $mn' + m'n - mn$.




REMARKS. In the application to proving Theorem 8.5, we will have $m' = m$
and $n' = n$, and then R(f, g) is homogeneous of degree mn. Problem 8 at the
end of the chapter concerns a situation for which $m' \ne m$ and $n' \ne n$.


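PROOF. There is no loss of generality in assuming that K is algebraically closed,
hence in particular is infinite. Each nonzero entry $R(f, g)_{ij}$ of R(f, g) is a
coefficient of f or of g. For each entry, define p(i, j) such that
$R(f, g)_{ij}(tx_1, \dots, tx_r) = t^{p(i,j)} R(f, g)_{ij}(x_1, \dots, x_r)$. The assembled matrix R with
powers of t in place is

$$\begin{pmatrix}
t^{m'} f_0 & t^{m'-1} f_1 & \cdots & t^{m'-m} f_m & & \\
0 & t^{m'} f_0 & \cdots & & t^{m'-m} f_m & \\
 & & \ddots & & & \ddots \\
t^{n'} g_0 & t^{n'-1} g_1 & \cdots & t^{n'-n} g_n & & \\
0 & t^{n'} g_0 & \cdots & & t^{n'-n} g_n & \\
 & & \ddots & & & \ddots
\end{pmatrix} \tag{$*$}$$

It turns out that there is a function q(i) such that r(j) = q(i) + p(i, j) depends
only on j. Here $t^{q(i)}$ is the $i$th entry of

$$(t^{n'}, t^{n'-1}, \dots, t^{n'-n+1};\ t^{m'}, t^{m'-1}, \dots, t^{m'-m+1}).$$

The matrix (∗) with $t^{q(i)}$ multiplying every entry of the $i$th row is

$$\begin{pmatrix}
t^{n'} t^{m'} f_0 & t^{n'} t^{m'-1} f_1 & \cdots & t^{n'} t^{m'-m} f_m & & \\
0 & t^{n'-1} t^{m'} f_0 & \cdots & & t^{n'-1} t^{m'-m} f_m & \\
 & & \ddots & & & \ddots \\
t^{m'} t^{n'} g_0 & t^{m'} t^{n'-1} g_1 & \cdots & t^{m'} t^{n'-n} g_n & & \\
0 & t^{m'-1} t^{n'} g_0 & \cdots & & t^{m'-1} t^{n'-n} g_n & \\
 & & \ddots & & & \ddots
\end{pmatrix} \tag{$**$}$$

In (∗∗), $t^{r(j)}$ is the $j$th entry of $(t^{m'+n'}, t^{m'+n'-1}, \dots, t^{m'+n'-m-n+1})$. Then we
have

$$t^u R(f, g)(tx_1, \dots, tx_r) = t^v R(f, g)(x_1, \dots, x_r),$$

where $u = \sum_i q(i)$ and $v = \sum_j r(j)$. So

$$R(f, g)(tx_1, \dots, tx_r) = t^{v-u} R(f, g)(x_1, \dots, x_r).$$

In other words, R(f, g) is a homogeneous function. Since K is infinite, R(f, g)
is homogeneous as a member of A. Computing u and v, we find that
$u = mm' + nn' - \tfrac{1}{2}m(m-1) - \tfrac{1}{2}n(n-1)$ and $v = (m+n)(m'+n') - \tfrac{1}{2}(m+n)(m+n-1)$.
Therefore $v - u = mn' + m'n - mn$, and the degree of homogeneity of R(f, g)
is $mn' + m'n - mn$.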
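A quick numerical check of the lemma in the case m′ = m = 2 and n′ = n = 2, where the predicted degree is mn = 4, can be sketched as follows. The two polynomials are hypothetical examples chosen for this sketch, and the row ordering of the Sylvester-type matrix (constant terms first) is an assumption; a different ordering changes the determinant by at most a sign, which does not affect homogeneity.

```python
from fractions import Fraction

def sylvester(f, g):
    """Sylvester-type matrix of coefficient lists given with constant term first."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = [[0] * i + list(f) + [0] * (size - m - 1 - i) for i in range(n)]
    rows += [[0] * i + list(g) + [0] * (size - n - 1 - i) for i in range(m)]
    return rows

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def resultant_in_W(x, y):
    """Resultant in W of F = (x^2 + y^2) - W^2 and G = (xy - y^2) + xW + W^2."""
    f = [x * x + y * y, 0, -1]   # f_0, f_1, f_2 of degrees 2, 1, 0 in (x, y)
    g = [x * y - y * y, x, 1]    # g_0, g_1, g_2 of degrees 2, 1, 0 in (x, y)
    return det(sylvester(f, g))

# Homogeneity of degree mn = 4: R(tx, ty) = t^4 R(x, y).
for x, y in [(1, 1), (1, 2), (3, -2)]:
    for t in [2, 3, -5]:
        assert resultant_in_W(t * x, t * y) == t**4 * resultant_in_W(x, y)
print(resultant_in_W(1, 1))
```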




Theorem 8.5 (Bezout’s Theorem). Let K be a field, let $K_{\mathrm{alg}}$ be an algebraic
closure, and suppose that F in $K[X, Y, W]_m$ and G in $K[X, Y, W]_n$ are projective
plane curves. Then their locus $V(F) \cap V(G)$ of common zeros in $\mathbb{P}^2_{K_{\mathrm{alg}}}$ is
nonempty. If this zero locus has more than mn points, then F and G have as a
common factor some homogeneous polynomial in (X, Y, W) of positive degree.


REMARKS. For two polynomials f(X, Y) and g(X, Y) in affine space, application
of Theorem 8.1 concerning the resultant in the Y variable involves checking
that at least one of the polynomials has the expected degree in the Y variable, and
doing so may not be so easy. In the projective setting, this problem disappears
if we apply a projective transformation and arrange that [0, 0, 1] not be on the
zero locus of one of the given polynomials, say F(X, Y, W). In fact, if F is in
$K[X, Y, W]_m$, then the coefficient of $W^m$ has to be a constant, and this term is
the only term of F that contributes to the value of F at (0, 0, 1). With the above
adjustment the coefficient must be nonzero, and Theorem 8.1 is applicable.


PROOF. Without loss of generality, we may assume throughout that K is
algebraically closed. Write F and G in the form

$$\begin{aligned}
F(X, Y, W) &= f_0 + f_1W + \cdots + f_mW^m \quad\text{with } f_j \in K[X, Y]_{m-j}, \\
G(X, Y, W) &= g_0 + g_1W + \cdots + g_nW^n \quad\text{with } g_j \in K[X, Y]_{n-j}.
\end{aligned} \tag{$*$}$$

Pick a point (x, y, w) at which F is nonzero, and move it to (0, 0, 1) by a projective
transformation, so that $F(0, 0, 1) \ne 0$. Regarding F and G as polynomials in W,
with coefficients in A = K[X, Y], we form R(F, G), which Lemma 8.4 identifies
as a member of $K[X, Y]_{mn}$.

Since R(F, G) is homogeneous as a member of K[X, Y] and since K is algebraically
closed, we can choose a point $(x_0, y_0) \ne (0, 0)$ with $R(F, G)(x_0, y_0)
= 0$. Then the resultant of $F(x_0, y_0, W)$ and $G(x_0, y_0, W)$ is 0, and Theorem 8.1
applies because $F(x_0, y_0, W)$ has degree m in W. The theorem says
that these two polynomials in W have a common factor. Since K is algebraically
closed, this common factor vanishes at some $w_0$, and then we must
have $F(x_0, y_0, w_0) = G(x_0, y_0, w_0) = 0$. This proves the first conclusion.

For the second conclusion, suppose that $V(F) \cap V(G)$ contains mn + 1 points.
Join these points by lines, and pick a point of $\mathbb{P}^2_K$ that is not on any of the lines.
We can do so because K, being algebraically closed, is infinite. Applying a
projective transformation, we may assume that the point is [0, 0, 1]. Write F and
G in the form (∗). Regarding F and G as polynomials in W, with coefficients in
A = K[X, Y], we again form R(F, G), which Lemma 8.4 identifies as a member
of $K[X, Y]_{mn}$. For fixed $(x_0, y_0)$, Theorem 8.1 says that $R(F, G)(x_0, y_0) = 0$ if
and only if $F(x_0, y_0, W)$ and $G(x_0, y_0, W)$ have a common factor (necessarily a
common factor of the form $W - w_0$ because K is algebraically closed), if and
only if $F(x_0, y_0, w_0) = G(x_0, y_0, w_0) = 0$ for some $w_0$. So at each of our mn + 1




points, say $(x_i, y_i, w_i)$, we have $R(F, G)(cx_i, cy_i) = 0$ for all scalars c. Since
$(x_i, y_i) \ne (0, 0)$, R(F, G) vanishes on the line $y_iX - x_iY = 0$. Consequently
$y_iX - x_iY$ divides R(F, G) in K[X, Y].

Suppose that $(x_j, y_j)$ is a multiple of $(x_i, y_i)$ with $i \ne j$. Then $(x_i, y_i, w_i)$ and
$(x_j, y_j, w_j)$ both satisfy $y_iX - x_iY = 0$. Since (0, 0, 1) satisfies this also and since
(0, 0, 1) is not to be on any of the connecting lines, we obtain a contradiction.

Thus the mn + 1 factors $y_iX - x_iY$ are nonassociate primes in K[X, Y] dividing
R(F, G). By unique factorization for K[X, Y], their product divides R(F, G).
Since deg R(F, G) = mn, we conclude that R(F, G) = 0. Then Theorem
8.1 shows that F and G have a nonconstant common factor in K[X, Y][W] =
K[X, Y, W]. The common factor is homogeneous by Lemma 8.3, and the second
conclusion is proved.


4. Intersection Multiplicity for a Line with a Curve 


In this section we begin the topic of “intersection multiplicity” for projective plane 
curves. The idea is that the number of points in the intersection $V(F) \cap V(G)$ in
Bezout’s Theorem as formulated in Theorem 8.5 should actually equal mn, not 
merely be bounded above by mn, if the field is algebraically closed and the points 
are counted according to their “multiplicities,” whatever that might mean. 

The prototype is the factorization of a polynomial of degree n in one variable. 
The polynomial has at most n roots, and it has exactly n if the field is algebraically 
closed and each root is counted according to its multiplicity. In this case, as we 
well know, a root $z_0$ of f(z) has multiplicity k if $(z - z_0)^k$ is the largest power of
$z - z_0$ that divides f(z).

Our objective in this section is to develop a notion of intersection multiplicity 
for the case of a line and a curve at a point; the case of two curves is less 
intuitive and is postponed to the next section. The main result is to be that the 
sum of the intersection multiplicities at all points for a line and a projective 
plane curve equals the degree of the curve, provided that the underlying field is 
algebraically closed and that the line does not divide the curve. The statement 
in the previous paragraph about polynomials in one variable will amount to a 
special case; for this special case the projective line is Y, the projective curve is 
of the form $W^{d-1}Y - F(X, W)$, where F is homogeneous of degree d and where
f(X) = F(X, 1), and the divisibility proviso is that F not be the 0 polynomial,
i.e., that f(z) not be identically 0.

Let K be a field, let L be in $K[X, Y, W]_1$, and let F be in $K[X, Y, W]_d$.
The notation for intersection multiplicity will be $I(P, L \cap F)$, where $P =
(x_0, y_0, w_0)$ is in $V_K(F) \cap V_K(L)$. To make the definition, we introduce affine




local coordinates. Choose Φ in GL(3, K) with $\Phi(x_0, y_0, w_0) = (0, 0, 1)$, and
form the corresponding local expressions

$$f(X, Y) = F(\Phi^{-1}(X, Y, 1)) = f_1(X, Y) + \cdots + f_d(X, Y),$$
$$l(X, Y) = L(\Phi^{-1}(X, Y, 1)).$$

Here $f_j$ is the part of f that is homogeneous of degree j. Since l(0, 0) = 0,
we see that l(X, Y) = bX − aY for some constants a and b not both 0. Then
$\varphi(t) = \begin{pmatrix} at \\ bt \end{pmatrix}$ for $t \in K$, is a parametrization of the locus in $\mathbb{A}^2_K$ on which
l(x, y) = 0. The composition $f(\varphi(t))$ is a polynomial in t with $f(\varphi(0)) = 0$. In
fact,

$$f(\varphi(t)) = f_1(at, bt) + f_2(at, bt) + \cdots + f_d(at, bt)
= t f_1(a, b) + t^2 f_2(a, b) + \cdots + t^d f_d(a, b).$$

There are two possibilities. If $f \circ \varphi$ is not the 0 polynomial, then $f(\varphi(t))$
has a zero of some finite order at t = 0, and this order is defined to be the
intersection multiplicity, or intersection number, $I(P, L \cap F)$. If $f \circ \varphi$ is the
0 polynomial, then we say that $I(P, L \cap F) = +\infty$. It will be convenient to
define $I(P, L \cap F) = 0$ if P is not in $V_K(L) \cap V_K(F)$. We need to check that
$I(P, L \cap F)$ does not depend on the choice of Φ, but we postpone this verification
until after we consider two examples.


EXAMPLES OF INTERSECTION MULTIPLICITY. 


(1) Example 1 in the previous section showed that relative to a suitable Φ in
GL(3, K), the projective plane curve

$$F(X, Y, W) = X^2Y + XYW + W^3$$

has local expression f(X, Y) about $P = (x_0, y_0, 1)$ given by

$$\begin{aligned}
f(X, Y) &= (x_0^2y_0 + x_0y_0 + 1) + (x_0^2Y + 2x_0y_0X + x_0Y + y_0X) \\
&\quad + (y_0X^2 + 2x_0XY + XY) + (X^2Y) \\
&= f_0 + f_1(X, Y) + f_2(X, Y) + f_3(X, Y).
\end{aligned}$$

For a line L, the intersection multiplicity $I(P, L \cap F)$ is 0 unless P lies in $V_K(F)$,
i.e., unless $f_0 = x_0^2y_0 + x_0y_0 + 1 = 0$. Suppose that the line L is given by

$$L(X, Y, W) = \alpha X + \beta Y + \gamma W,$$

with local expression

$$l(X, Y) = L(X + x_0, Y + y_0, 1) = (\alpha x_0 + \beta y_0 + \gamma) + (\alpha X + \beta Y).$$




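Here α and β are not both 0. The intersection multiplicity $I(P, L \cap F)$ is 0 unless
P lies also in $V_K(L)$, i.e., unless $\alpha x_0 + \beta y_0 + \gamma = 0$. Thus suppose that P lies
in $V_K(L) \cap V_K(F)$. Then we can parametrize the locus for which l(x, y) = 0 by
$\begin{pmatrix} x \\ y \end{pmatrix} = \varphi(t) = \begin{pmatrix} -\beta t \\ \alpha t \end{pmatrix}$, and we obtain

$$f_1(\varphi(t)) = f_1(-\beta t, \alpha t) = t(x_0^2\alpha - 2x_0y_0\beta + x_0\alpha - y_0\beta),$$
$$f_2(\varphi(t)) = f_2(-\beta t, \alpha t) = t^2(y_0\beta^2 - 2x_0\alpha\beta - \alpha\beta).$$

One point lying in $V_K(F)$ is $P = (x_0, y_0, 1) = (1, -\tfrac{1}{2}, 1)$, and P lies also
in $V_K(L)$ if $\alpha - \tfrac{1}{2}\beta + \gamma = 0$, i.e., if γ satisfies $\gamma = \tfrac{1}{2}\beta - \alpha$. Then we
have $f_1(\varphi(t)) = t(2\alpha + \tfrac{3}{2}\beta)$ and $f_2(\varphi(t)) = t^2(-\tfrac{1}{2}\beta^2 - 3\alpha\beta)$. Consequently,
$I(P, L \cap F)$ is ≥ 1 if and only if $\gamma = \tfrac{1}{2}\beta - \alpha$. In this case, $I(P, L \cap F)$ is ≥ 2 if
and only if $2\alpha + \tfrac{3}{2}\beta = 0$, i.e., if $\alpha = -\tfrac{3}{4}\beta$. When both conditions are satisfied,
we have $f_2(\varphi(t)) = t^2(-\tfrac{1}{2}\beta^2 - 3\alpha\beta) = t^2(\tfrac{7}{4}\beta^2)$, and as long as 7 is nonzero
in K, this is not the 0 function because under these conditions, β = 0 would
imply that (α, β, γ) = (0, 0, 0); hence $I(P, L \cap F) = 2$.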

(2) Example 2 in the previous section considered the point $P = (x_0, y_0, w_0) =
(0, 1, 0)$ for the same F, namely $F(X, Y, W) = X^2Y + XYW + W^3$. This P
lies in $V_K(F)$. For a suitable Φ, the earlier computations showed that the local
expression for F is

$$f(X, Y) = (Y^2 + XY) + (X^3).$$

The most general line L for which P lies in $V_K(L)$ is $\alpha X + \gamma W = 0$, and the
corresponding local expression is

$$l(X, Y) = L(Y, 1, X) = \alpha Y + \gamma X.$$

We use the parametrization $\varphi(t) = (-\alpha t, \gamma t)$ for L and obtain

$$f(\varphi(t)) = t^2(\gamma^2 - \alpha\gamma) + t^3(-\alpha^3).$$

By inspection we see that $I(P, L \cap F) \ge 2$ for all choices of α and γ, and that
$I(P, L \cap F) \ge 3$ if and only if γ = 0 or γ = α. If γ = 0 or γ = α, then $\alpha^3$
cannot be 0, and thus $I(P, L \cap F) = 3$.
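The definition reduces to finding the smallest j for which the homogeneous part f_j does not vanish at the direction (a, b) of the line. Here is a minimal sketch over the rationals; working in characteristic 0, storing f as a dictionary of coefficients, and the name `line_multiplicity` are assumptions made for illustration. It re-runs both examples with specific numerical lines.

```python
import math
from fractions import Fraction

def eval_poly(f, x, y):
    """Evaluate a polynomial given as {(i, j): coefficient of X^i Y^j}."""
    return sum(c * x**i * y**j for (i, j), c in f.items())

def homogeneous_part(f, k):
    """The part of f that is homogeneous of degree k."""
    return {(i, j): c for (i, j), c in f.items() if i + j == k}

def line_multiplicity(f, a, b):
    """Order of vanishing at t = 0 of f(at, bt); math.inf if identically zero."""
    degree = max((i + j for (i, j), c in f.items() if c != 0), default=0)
    for k in range(degree + 1):
        if eval_poly(homogeneous_part(f, k), a, b) != 0:
            return k
    return math.inf

# Example 2: f = Y^2 + XY + X^3, line direction (-alpha, gamma).
example2 = {(0, 2): 1, (1, 1): 1, (3, 0): 1}
print(line_multiplicity(example2, -1, 2))  # alpha = 1, gamma = 2 (generic line)
print(line_multiplicity(example2, -1, 0))  # gamma = 0
print(line_multiplicity(example2, -1, 1))  # gamma = alpha

# Example 1 at P = (1, -1/2, 1): f = (2Y - (3/2)X) + (-(1/2)X^2 + 3XY) + X^2 Y,
# with beta = 4 and alpha = -3, so the direction is (-beta, alpha) = (-4, -3).
example1 = {(0, 1): 2, (1, 0): Fraction(-3, 2),
            (2, 0): Fraction(-1, 2), (1, 1): 3, (2, 1): 1}
print(line_multiplicity(example1, -4, -3))
```

A return value of `math.inf` corresponds to the case in which the composition is the 0 polynomial, for instance `line_multiplicity({(1, 1): 1}, 1, 0)` for f = XY along the line Y = 0.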


Let us return to the verification that $I(P, L \cap F)$ does not depend on the choice
of Φ. Thus suppose that Ψ is another member of GL(3, K) with $\Psi(x_0, y_0, w_0) =
(0, 0, 1)$. Write

$$\Psi \circ \Phi^{-1} = \begin{pmatrix} \alpha & \beta & 0 \\ \gamma & \delta & 0 \\ r & s & 1 \end{pmatrix},$$




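form the local expressions

$$f'(X, Y) = F(\Psi^{-1}(X, Y, 1)) = f'_1(X, Y) + \cdots + f'_d(X, Y),$$
$$l'(X, Y) = L(\Psi^{-1}(X, Y, 1)) = b'X - a'Y,$$

and parametrize the locus in $\mathbb{A}^2_K$ with l′(x, y) = 0 by

$$\begin{pmatrix} x \\ y \end{pmatrix} = \varphi'(t) = \begin{pmatrix} a't \\ b't \end{pmatrix}.$$

We need a lemma.

Lemma 8.6. In the above notation, f(X, Y) equals

$$\begin{aligned}
&(rX + sY + 1)^{d-1} f'_1(\alpha X + \beta Y, \gamma X + \delta Y) \\
&\quad + (rX + sY + 1)^{d-2} f'_2(\alpha X + \beta Y, \gamma X + \delta Y) \\
&\quad + \cdots + f'_d(\alpha X + \beta Y, \gamma X + \delta Y),
\end{aligned}$$

and therefore

$$f_1(X, Y) = f'_1(\alpha X + \beta Y, \gamma X + \delta Y).$$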
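PROOF. For the first conclusion, let us justify the following computation:

$$\begin{aligned}
f(X, Y) &= (F \circ \Psi^{-1})(\Psi \circ \Phi^{-1})(X, Y, 1) \\
&= (F \circ \Psi^{-1})(\alpha X + \beta Y, \gamma X + \delta Y, rX + sY + 1) \\
&= (F \circ \Psi^{-1})\Bigl((rX + sY + 1)\Bigl(\tfrac{\alpha X + \beta Y}{rX + sY + 1}, \tfrac{\gamma X + \delta Y}{rX + sY + 1}, 1\Bigr)\Bigr) \\
&= (rX + sY + 1)^d\, (F \circ \Psi^{-1})\Bigl(\tfrac{\alpha X + \beta Y}{rX + sY + 1}, \tfrac{\gamma X + \delta Y}{rX + sY + 1}, 1\Bigr) \\
&= (rX + sY + 1)^d\, (f'_1 + \cdots + f'_d)\Bigl(\tfrac{\alpha X + \beta Y}{rX + sY + 1}, \tfrac{\gamma X + \delta Y}{rX + sY + 1}\Bigr) \\
&= (rX + sY + 1)^{d-1} f'_1(\alpha X + \beta Y, \gamma X + \delta Y) \\
&\quad + (rX + sY + 1)^{d-2} f'_2(\alpha X + \beta Y, \gamma X + \delta Y) \\
&\quad + \cdots + f'_d(\alpha X + \beta Y, \gamma X + \delta Y).
\end{aligned}$$

In fact, the first three lines are valid if we make the computation in the field of
fractions K(X, Y), the fourth line uses the homogeneity of F and a substitution
homomorphism that evaluates members of K[X, Y, W] at points of K(X, Y, W),
and the remaining lines use the homogeneity of $f'_1, \dots, f'_d$ and a substitution
homomorphism that evaluates their arguments at points of K(X, Y).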




This proves the first conclusion. To derive the second conclusion from it, we
expand each of the coefficients on the right side and group terms of the same
degree of homogeneity under $(X, Y) \mapsto (\lambda X, \lambda Y)$. The only term whose degree
of homogeneity is 1 is $f'_1(\alpha X + \beta Y, \gamma X + \delta Y)$ with a coefficient 1 coming from the
expansion of $(rX + sY + 1)^{d-1}$; all other terms have higher degree of homogeneity.
When f(X, Y) on the left side is expanded as a sum of homogeneous polynomials,
the term of degree 1 is $f_1(X, Y)$. The second conclusion follows.


Continuing with the verification that $I(P, L \cap F)$ does not depend on the
choice of Φ, we apply Lemma 8.6 to L in place of F, and we obtain

$$l(X, Y) = l'(\alpha X + \beta Y, \gamma X + \delta Y).$$

Since l(X, Y) = bX − aY and l′(X, Y) = b′X − a′Y, this equation shows that

$$b = b'\alpha - a'\gamma \quad\text{and}\quad -a = b'\beta - a'\delta.$$

Putting $\Delta = \alpha\delta - \beta\gamma$, we solve for a′ and b′ and obtain

$$\alpha a + \beta b = \Delta a' \quad\text{and}\quad \gamma a + \delta b = \Delta b'.$$

When x = at and y = bt, we thus have

$$\alpha x + \beta y = \alpha at + \beta bt = t\Delta a' \quad\text{and}\quad \gamma x + \delta y = \gamma at + \delta bt = t\Delta b'.$$

Substituting these formulas into the first conclusion of Lemma 8.6 and using the
homogeneity of each $f'_j$ gives

$$\begin{aligned}
f(\varphi(t)) &= (art + bst + 1)^{d-1}\, t\Delta f'_1(a', b') \\
&\quad + (art + bst + 1)^{d-2}\, t^2\Delta^2 f'_2(a', b') + \cdots + t^d\Delta^d f'_d(a', b').
\end{aligned}$$

If j is the smallest index for which $f'_j(a', b') \ne 0$, then the lowest power of
t remaining on the right side after expansion of the coefficients is $t^j$, and its
coefficient is $\Delta^j f'_j(a', b')$. Thus we can conclude that the lowest power of t with
nonzero coefficient on the left side is $t^j$, and its coefficient $f_j(a, b)$ must equal
$\Delta^j f'_j(a', b')$. The equality of the lowest power of t remaining on each side shows
that $I(P, L \cap F)$ is the same when computed from f as when computed from f′,
and we obtain as a bonus the formula $f_j(a, b) = \Delta^j f'_j(a', b')$ if $t^j$ is that power.
This completes the verification that $I(P, L \cap F)$ does not depend on the choice
of Φ.


Now we come back to the circle of ideas around Bezout’s Theorem. The first 
task is to clarify the meaning of infinite intersection multiplicity. 




Proposition 8.7. Over the field K if a projective line L and a projective plane
curve F meet at a point P in $\mathbb{P}^2_K$, then $I(P, L \cap F) = +\infty$ if and only if L
divides F.


PROOF. If L divides F, then in the above notation the local expression l(X, Y)
divides f(X, Y). Since $l(\varphi(t))$ is the 0 polynomial, so is $f(\varphi(t))$.

Conversely suppose that $f(\varphi(t))$ is the 0 polynomial, so that $f_r(a, b) = 0$ for
all r with $1 \le r \le d = \deg F$. Without loss of generality, suppose $b \ne 0$. The
equality

$$\begin{aligned}
0 = f_r(a, b) &= c_0a^r + c_1a^{r-1}b + \cdots + c_rb^r \\
&= b^r\bigl(c_0(ab^{-1})^r + c_1(ab^{-1})^{r-1} + \cdots + c_r\bigr)
\end{aligned}$$

says that $Z - ab^{-1}$ is a factor of $b^r(c_0Z^r + c_1Z^{r-1} + \cdots + c_r)$. If we write

$$b^r(c_0Z^r + c_1Z^{r-1} + \cdots + c_r) = (Z - ab^{-1})u(Z)$$

and take $Z = XY^{-1}$, then

$$\begin{aligned}
b^r f_r(X, Y) &= b^rY^r\bigl(c_0(XY^{-1})^r + c_1(XY^{-1})^{r-1} + \cdots + c_r\bigr) \\
&= Y^r(XY^{-1} - ab^{-1})u(XY^{-1}) = b^{-1}l(X, Y)\bigl(Y^{r-1}u(XY^{-1})\bigr).
\end{aligned}$$

Hence l(X, Y) divides $f_r(X, Y)$ for all r. It follows that l(X, Y) divides f(X, Y)
and then that L divides F.


The full-strength version of Bezout’s Theorem says that two projective plane 
curves F and G of degrees m and n meet in at most mn points even when 
multiplicities are counted, and that the number is equal to mn if K is algebraically 
closed and multiplicities are counted. This theorem will be proved in Section 6. 
For the time being, we shall limit ourselves to the special case of the full-strength 
theorem in which one of the curves is a line. 


Theorem 8.8 (Bezout’s Theorem). Let K be an algebraically closed field. If
F is a projective plane curve over K of degree d and if L is a projective line such
that L does not divide F, then $\sum_P I(P, L \cap F) = d$.


PROOF. First we show that

$$\sum_P I(P, L \cap F) < +\infty. \tag{$*$}$$

Since L is assumed not to divide F, Proposition 8.7 shows that $I(P, L \cap F)$
is finite at every point of $V_K(L) \cap V_K(F)$. Thus $\sum_P I(P, L \cap F)$ is finite if




there are only finitely many points in $V_K(L) \cap V_K(F)$. Bezout’s Theorem in the
form of Theorem 8.5 shows that either $V_K(L) \cap V_K(F)$ is finite or else L and
F have as a common factor some homogeneous polynomial of positive degree.
Since L has degree 1, L is prime, and thus L and F can have a common factor of
positive degree only if L divides F. We are assuming the contrary, and therefore
$V_K(L) \cap V_K(F)$ is finite. This proves (∗).

Possibly by applying a projective transformation, we may assume⁷ that the
given line L is the line at infinity W. Then the points $P_i$ with $I(P_i, W \cap F) > 0$
are of the form $[x_i, y_i, 0]$. Taking into account that the algebraically closed field
K is necessarily infinite, we can apply a second projective transformation, one
that translates the Y variable, and assume that no $y_i$ is 0. Then we can write
$P_i = [r_i, 1, 0]$ with $r_i$ in K. Let us see that

$$H(X) = F(X, 1, 0) \text{ is a nonzero polynomial of degree exactly } d. \tag{$**$}$$

In fact, F(X, Y, W) is homogeneous of degree d, and we have arranged that
[1, 0, 0], which certainly lies in $V_K(W)$, is not in $V_K(F)$. Consequently the $X^d$
term in F(X, Y, W) has nonzero coefficient, and (∗∗) follows.

Next let us prove that

$$I((r, 1, 0), W \cap F) = \text{multiplicity of } r \text{ as a root of } H(X) = F(X, 1, 0). \tag{$\dagger$}$$

Then it will follow that $\sum_P I(P, W \cap F)$ equals the number of roots of
H(X) = F(X, 1, 0), each counted as many times as its multiplicity. In view
of (∗∗) and the fact that K is algebraically closed, we will then have proved that
$\sum_P I(P, W \cap F) = d$, as required.
To prove (†), we introduce affine local coordinates about (r, 1, 0), using
$\Phi^{-1} = \begin{pmatrix} 1 & 0 & r \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$, so that $\Phi(r, 1, 0) = (0, 0, 1)$. The local versions f of F and l of W
relative to this Φ are

$$f(X, Y) = F(\Phi^{-1}(X, Y, 1)) = F(X + r, 1, Y),$$
$$l(X, Y) = W(\Phi^{-1}(X, Y, 1)) = Y.$$

Hence l(X, Y) is of the form bX − aY with a = −1 and b = 0. If we parametrize
l by $\varphi(t) = (at, bt) = (-t, 0)$, then

$$f(\varphi(t)) = f(-t, 0) = F(-t + r, 1, 0).$$

The order of vanishing of $f(\varphi(t))$ at t = 0, which is $I([r, 1, 0], W \cap F)$, thus
equals the order of the zero of F(−t + r, 1, 0) at t = 0, which equals the
multiplicity of r as a root of H(X) = F(X, 1, 0). This proves (†), and the
theorem follows.

⁷ If P and P′ are distinct points in $\mathbb{P}^2_K$, then there exists a projective transformation carrying P
to [1, 0, 0] and P′ to [0, 1, 0]. This transformation carries the unique line through P and P′ to the
line at infinity.
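The last step of the proof can be illustrated with a small computation. The sketch below takes the hypothetical cubic F(X, Y, W) = X^3 − 3XY^2 + 2Y^3 + W^3, which is not divisible by W and does not vanish at [1, 0, 0], and computes the multiplicities of the roots of H(X) = F(X, 1, 0) = X^3 − 3X + 2 by repeated synthetic division. The curve and the function names are assumptions chosen so that all roots are rational; over a general field the roots would live in an algebraic closure.

```python
def divide_by_root(coeffs, r):
    """Synthetic division of a polynomial (leading coefficient first) by X - r.

    Returns the pair (quotient coefficients, remainder).
    """
    values, carry = [], 0
    for c in coeffs:
        carry = carry * r + c
        values.append(carry)
    return values[:-1], values[-1]

def root_multiplicity(coeffs, r):
    """Multiplicity of r as a root of a nonzero polynomial."""
    multiplicity = 0
    while len(coeffs) > 1:
        quotient, remainder = divide_by_root(coeffs, r)
        if remainder != 0:
            break
        coeffs, multiplicity = quotient, multiplicity + 1
    return multiplicity

# H(X) = F(X, 1, 0) = X^3 - 3X + 2 = (X - 1)^2 (X + 2)
H = [1, 0, -3, 2]
multiplicities = {r: root_multiplicity(H, r) for r in (1, -2)}
print(multiplicities)                 # {1: 2, -2: 1}
print(sum(multiplicities.values()))   # 3, the degree of F
```

In the notation of the proof, the line at infinity meets this curve at [1, 1, 0] with multiplicity 2 and at [−2, 1, 0] with multiplicity 1, for a total of d = 3.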


5. Intersection Multiplicity for Two Curves 


In this section we continue the topic of “intersection multiplicity” begun in Sec- 
tion 4. That section dealt with intersection multiplicity for the special case of a 
projective line and a projective plane curve, and the present section deals with 
the general case of two projective plane curves. The next section will use the 
general notion to address Bezout’s Theorem in full generality. In this section and 
the next we shall make occasional use of material from Chapter VII, especially 
Lemma 7.21 and the results in Section VII.1.

It is worth reviewing qualitatively what happened in Section 4. What we 
did was refer the given line and curve to affine space, parametrize the line in a 
natural way, and substitute the parametrization into the formula for the curve to 
obtain a scalar-valued function of one variable. The order of vanishing of the 
resulting scalar-valued function of one variable was defined to be the intersection 
multiplicity. The classical approach⁸ for handling two curves proceeds by trying
to generalize this construction, in effect parametrizing one curve and substituting 
into the other. The fact that there need be no natural parametrization of either 
of the curves leads to a number of complications, and ultimately the argument 
involves a complicated ring of power series. 

We shall follow a somewhat more modern approach⁹ based on localizations.¹⁰
The definition is not particularly intuitive, and it is necessary to study some 
examples to see its virtues. We give the definition, show that the definition is 
consistent with the definition in the special case of Section 4, check that the 
definition makes sense in general, state some properties that are useful in making 
computations, work out an example, and then verify the properties. Thus let $F$
and $G$ be homogeneous polynomials in $(X, Y, W)$ of respective degrees $m$ and $n$,
and let $P = [x_0, y_0, w_0]$ be a point of the projective plane $\mathbb{P}^2_K$ over a field $K$. We
refer matters back to affine space in the usual way by letting $\Phi$ be any member
of $GL(3, K)$ such that $\Phi(x_0, y_0, w_0) = (0, 0, 1)$. The local expressions from $\Phi$




⁸ An account appears in Walker, Chapter IV.

⁹ See Fulton, Chapter 3, for the present section and Fulton, Chapter 5, for the next section.

¹⁰ For a still more modern and more general approach, see Serre’s Algèbre Locale. Serre’s opening
sentence summarizes matters by saying, “Intersection multiplicities in algebraic geometry are equal
to certain ‘Euler–Poincaré characteristics’ formed by means of the Tor functors of Cartan–Eilenberg.”




about (0, 0) corresponding to F and G are the polynomials f and g with 


$$f(X, Y) = F(\Phi^{-1}(X, Y, 1)),$$
$$g(X, Y) = G(\Phi^{-1}(X, Y, 1)).$$


These polynomials break into homogeneous parts as


$$f(X, Y) = f_0 + f_1(X, Y) + \cdots + f_m(X, Y),$$
$$g(X, Y) = g_0 + g_1(X, Y) + \cdots + g_n(X, Y),$$


with $f_j$ and $g_j$ homogeneous of degree $j$ in the pair $(X, Y)$. We assume that $P$
lies on the locus $V_K(F) \cap V_K(G)$ of common zeros of $F$ and $G$, and the condition
for this to happen is that $f_0 = g_0 = 0$. The order of vanishing $m_P(F)$ of $F$ at
$P$ is the first $j$ for which $f_j$ is not the zero polynomial; we saw as a consequence
of Lemma 8.6 that this quantity is well defined independently of the choice of $\Phi$.

The intersection multiplicity $I(P, F \cap G)$ of $F$ and $G$ at $P$ can be defined in
either of two equivalent ways. The equivalence of the two definitions will be used
repeatedly in the discussion and follows from the fact that localization commutes
with passage to the quotient by an ideal, a fact that was proved as Lemma 7.21.
One definition is


$$I(P, F \cap G) = \dim_K \big( (K[X, Y]/(f, g))_{(0,0)} \big),$$


where $(K[X, Y]/(f, g))_{(0,0)}$ is the localization at $(0, 0)$ of the $K$ algebra
$K[X, Y]/(f, g)$. That is, we form the quotient ring of $K[X, Y]$ by the ideal
generated over $K$ by $f$ and $g$, localize with respect to the maximal ideal of all
members of the quotient vanishing at $(0, 0)$, and compute the dimension of this
localization over $K$. The other definition is


$$I(P, F \cap G) = \dim_K \big( S^{-1}K[X, Y] / S^{-1}(f, g) \big),$$


where S is the multiplicative system in K[X, Y] consisting of the complement of 
the maximal ideal (X, Y), i.e., consisting of all polynomials that are nonvanishing 
at (0, 0). In either case all elements of the ring being localized have interpretations
as functions, and the multiplicative system consists of all the functions that are 
nonzero at a certain point. Nevertheless, the matter is a little subtle because 
some members of the multiplicative system in the first case may be zero divisors. 
Here is a lower-dimensional example of that phenomenon that can also serve as 
a guiding example for Theorem 8.12 below. 




EXAMPLE OF GEOMETRIC LOCALIZATION. $R = (K[X]/(X^2(X - 1)^2))_{(0)}$,
with the subscript indicating localization at 0. Before passage to the localization,
the quotient $Q = K[X]/(X^2(X - 1)^2)$ has dimension 4, with a basis consisting
of the cosets of $1, X, X^2, X^3$. The multiplicative system $S$ for localization at 0
consists of all members of the quotient that are nonzero at 0. The localization as a
set consists of equivalence classes of pairs $(r, s)$ with $r$ in $Q$ and $s$ in $S$, two pairs
$(r, s)$ and $(r', s')$ being equivalent if $t(rs' - r's) = 0$ for some $t$ in $S$. Localization
is a ring homomorphism, and we therefore consider the pairs $(r, s)$ in the class of
the additive identity. These have $t(r1 - 0s) = 0$ for some $t$. Then $t$ and $r$ have
representatives $t(X)$ and $r(X)$ in $K[X]$ such that $t(X)r(X) = p(X)X^2(X - 1)^2$
for some $p(X)$. Furthermore, $t(0) \neq 0$. Then $X^2$ must divide $r(X)$, and this
condition is also sufficient for the choice $t(X) = (X - 1)^2$. Thus the members
$X^2 q(X)$ of $K[X]$ give 0 in the localization, and the localization is isomorphic to
the 2-dimensional algebra $K[X]/(X^2)$.
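The key step of this example can be checked by direct polynomial arithmetic. The following is only a minimal sketch, assuming integer coefficients stand in for the field $K$; the helper names `poly_mul` and `poly_mod` are ours and are not part of the text. It confirms that $t(X) = (X-1)^2$, which is nonzero at 0, multiplies $X^2$ to 0 in $Q$ even though $X^2$ is nonzero in $Q$, so that $t$ is a zero divisor in $Q$, and it confirms that the same $t$ does not annihilate $X$.

```python
# Polynomials in K[X] as integer coefficient lists, lowest degree first.

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_mod(a, monic):
    """Remainder of a modulo a monic polynomial, padded to length deg(monic)."""
    a = list(a)
    d = len(monic) - 1
    for k in range(len(a) - 1, d - 1, -1):
        c = a[k]
        for i, m in enumerate(monic):
            a[k - d + i] -= c * m   # subtract c * X^(k-d) * monic
    a = a[:d]
    return a + [0] * (d - len(a))

MODULUS = [0, 0, 1, -2, 1]      # X^2 (X - 1)^2 = X^4 - 2X^3 + X^2
t = [1, -2, 1]                  # t(X) = (X - 1)^2, with t(0) = 1 != 0

t_times_x2 = poly_mod(poly_mul(t, [0, 0, 1]), MODULUS)   # coset of t * X^2 in Q
t_times_x = poly_mod(poly_mul(t, [0, 1]), MODULUS)       # coset of t * X in Q
x2_in_q = poly_mod([0, 0, 1], MODULUS)                   # coset of X^2 in Q

print(t_times_x2, t_times_x, x2_in_q)
```

The computation only illustrates the sufficiency half of the argument; the necessity of $X^2 \mid r(X)$ is the divisibility argument given in the example.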


Proposition 8.9 below will show that $I(P, F \cap G)$ is independent of the
function $\Phi$ used to introduce affine local coordinates. Assuming this independence,
we begin with an example that shows that the definition is consistent with the 
definition in Section 4. 


EXAMPLE 1 OF INTERSECTION MULTIPLICITY. Case of a line $L$ and a curve
$F$ homogeneous of degree $d$. Assuming that $P$ lies in $V_K(L) \cap V_K(F)$, we
introduce affine local coordinates by means of a member $\Phi$ of $GL(3, K)$ that
carries a representative of $P$ to $(0, 0, 1)$, and we let $l(X, Y)$ and $f(X, Y)$ be
the corresponding local expressions for $L$ and $F$. Let $f = f_1 + \cdots + f_d$
be the decomposition of $f$ into its homogeneous parts. Since the intersection
multiplicity is being assumed to be independent of the choice of $\Phi$ and since for
any second point on a line through $(0, 0, 1)$, there exists a $\Phi$ that fixes $(0, 0, 1)$
and carries that second point to $(1, 0, 1)$, we may assume that $l(X, Y) = Y$. We
introduce the parametrization $(x, y) = \varphi(t) = (t, 0)$ for the line $l(X, Y)$ and
substitute into $f(X, Y)$, obtaining $f(\varphi(t)) = f_1(t, 0) + \cdots + f_d(t, 0)$. In the
definition of Section 4, the intersection multiplicity is the least $r$ such that $f_r(t, 0)$
is not identically 0, or else it is $+\infty$ if $f(\varphi(t))$ is identically 0. With the new
definition we observe from the definition of $r$ that $f$ is of the form


$$f(X, Y) = (c_r X^r + \cdots + c_d X^d) + Y g(X, Y) = c_r X^r (1 + X h(X)) + Y g(X, Y)$$


with $c_r \neq 0$, $g(X, Y) \in K[X, Y]$, and $h(X) \in K[X]$. The ideal in $K[X, Y]$
generated by $Y$ and $f$ is the same as the ideal generated by $Y$ and $X^r(1 + Xh(X))$.
Hence


$$K[X, Y]/(Y, f) = K[X, Y]/(Y, X^r(1 + Xh)) \cong K[X]/(X^r(1 + Xh)).$$




The polynomial 1 + Xh(X) takes a nonzero value at 0 and hence is a member of 
the multiplicative system that we use to form the localization. Thus 


$$(K[X, Y]/(Y, f))_{(0,0)} \cong (K[X]/(X^r(1 + Xh)))_{(0)} \cong (K[X]/(X^r))_{(0)}.$$


The dimension of the right side is r, and thus the new definition of intersection 
multiplicity matches the old one. 
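The Section 4 recipe used in this example is easy to mechanize once the line has been normalized to $Y = 0$. The following minimal sketch assumes a polynomial is stored as a dictionary from exponent pairs $(i, j)$ to coefficients; that representation and the function name `order_along_x_axis` are illustrative choices of ours, not notation from the text. It returns the least $r$ with $f_r(t, 0) \neq 0$, or infinity when $f(t, 0)$ vanishes identically.

```python
import math

def order_along_x_axis(f):
    """Order of vanishing at t = 0 of f(t, 0).

    f maps (i, j) to the coefficient of X**i * Y**j.  Only the terms with
    j == 0 survive the substitution (X, Y) = (t, 0).  Returns math.inf when
    f(t, 0) is identically 0, i.e., when Y divides f.
    """
    degrees = [i for (i, j), c in f.items() if j == 0 and c != 0]
    return min(degrees) if degrees else math.inf

# f = 2X^2 + 3X^3 + XY + Y^2, so f(t, 0) = 2t^2 (1 + (3/2) t) and r = 2.
sample = {(2, 0): 2, (3, 0): 3, (1, 1): 1, (0, 2): 1}
# The cusp Y^2 - X^3 meets the line Y = 0 with multiplicity 3 at the origin.
cusp = {(0, 2): 1, (3, 0): -1}
# Y(X + 1) contains the line Y = 0, so the multiplicity is infinite.
contains_line = {(1, 1): 1, (0, 1): 1}

print(order_along_x_axis(sample), order_along_x_axis(cusp),
      order_along_x_axis(contains_line))
```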


Proposition 8.9. The intersection multiplicity of two projective plane curves
$F$ and $G$ at $P$ is well defined independently of the member $\Phi$ of $GL(3, K)$ that moves a
representative of $P$ to $(0, 0, 1)$.


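PROOF. It is enough to take $P = [0, 0, 1]$ and to compare the effect of passing
to affine local coordinates determined by the identity with the effect of passing
to the coordinates determined by a general element $\Phi$ of $GL(3, K)$ of the form

$$\Phi = \begin{pmatrix} \alpha & \beta & 0 \\ \gamma & \delta & 0 \\ r & s & 1 \end{pmatrix}.$$

Let $\deg F = m$ and $\deg G = n$. If $f(X, Y) = F(X, Y, 1)$ and
$\tilde f(X, Y) = F(\Phi^{-1}(X, Y, 1))$, then the computation in the proof of Lemma 8.6
shows that


$$f(X, Y) = (1 + rX + sY)^m \, \tilde f\!\left( \frac{\alpha X + \beta Y}{1 + rX + sY}, \frac{\gamma X + \delta Y}{1 + rX + sY} \right). \tag{$*$}$$


Similarly if $g(X, Y) = G(X, Y, 1)$ and $\tilde g(X, Y) = G(\Phi^{-1}(X, Y, 1))$, then


$$g(X, Y) = (1 + rX + sY)^n \, \tilde g\!\left( \frac{\alpha X + \beta Y}{1 + rX + sY}, \frac{\gamma X + \delta Y}{1 + rX + sY} \right).$$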

Let

$$X' = \frac{\alpha X + \beta Y}{1 + rX + sY}, \qquad Y' = \frac{\gamma X + \delta Y}{1 + rX + sY}, \qquad \text{and} \qquad \Phi^{-1} = \begin{pmatrix} \alpha' & \beta' & 0 \\ \gamma' & \delta' & 0 \\ r' & s' & 1 \end{pmatrix}.$$

It is purely a formal matter that the mapping $T$ defined by $(Th)(X, Y) = h(X', Y')$
is a field isomorphism of $K(X, Y)$ onto $K(X', Y')$. It sends $K[X, Y]$ onto
$K[X', Y']$ and sends $(K[X, Y])_{(0,0)}$ onto $(K[X', Y'])_{(0,0)}$. Referring to the formulas
for $X'$ and $Y'$, we see that the image of $K[X, Y]$ is contained in the localization
$(K[X, Y])_{(0,0)}$; by the universal mapping property of localizations, the image of
$(K[X, Y])_{(0,0)}$ is contained in $(K[X, Y])_{(0,0)}$. Comparing these two conclusions,
we see that $(K[X', Y'])_{(0,0)} \subseteq (K[X, Y])_{(0,0)}$.

Meanwhile, we can solve the equations defining $X'$ and $Y'$ for $X$ and $Y$. If we
compare the results with the formula for $\Phi^{-1}$, we find that


$$X = \frac{\alpha' X' + \beta' Y'}{1 + r'X' + s'Y'} \qquad \text{and} \qquad Y = \frac{\gamma' X' + \delta' Y'}{1 + r'X' + s'Y'}.$$




Thus the situation is symmetric, and we have $(K[X, Y])_{(0,0)} \subseteq (K[X', Y'])_{(0,0)}$.
Consequently the mapping


$$(Th)(X, Y) = h\!\left( \frac{\alpha X + \beta Y}{1 + rX + sY}, \frac{\gamma X + \delta Y}{1 + rX + sY} \right)$$


is an algebra automorphism of $(K[X, Y])_{(0,0)}$.
To prove the proposition, recall that localization commutes with passage to the
quotient by an ideal. In view of (*), it is therefore enough to show that


$$\dim_K \big( (K[X, Y])_{(0,0)} / (\tilde f, \tilde g) \big)
= \dim_K \big( (K[X, Y])_{(0,0)} / ((1 + rX + sY)^m T\tilde f,\ (1 + rX + sY)^n T\tilde g) \big). \tag{$**$}$$


The factor $(1 + rX + sY)$ is a unit in $(K[X, Y])_{(0,0)}$, and we can simplify the
quotient algebra on the right side of (**) to


$$(K[X, Y])_{(0,0)} / (T\tilde f, T\tilde g).$$


In turn, this algebra is $K$ isomorphic to $(K[X, Y])_{(0,0)} / (\tilde f, \tilde g)$ because $T$ is an
automorphism of $(K[X, Y])_{(0,0)}$. The dimensional equality in (**) follows.


Let us extend the definition of intersection multiplicity to include the case
that the point of interest does not lie in the locus of common zeros. We define
$I(P, F \cap G) = 0$ if $P$ is not in $V_K(F) \cap V_K(G)$. Assume now that $K$ is
algebraically closed. Below we compute a fairly typical example of intersection
multiplicity. To do so, we shall make use of certain properties of $I(P, F \cap G)$
that we list in Theorem 8.10 below. In fact, there is an algorithm for computing
$I(P, F \cap G)$ using only these properties,¹¹ but we shall not give it.

Before stating the properties, we need to make some definitions. Recall from
earlier in the section that the order of vanishing $m_P(G)$ of $G$ at $P$ is computed using
a suitable $\Phi$ in $GL(3, K)$ to refer $G$ to affine local coordinates about $P$, defining
$g(X, Y) = G(\Phi^{-1}(X, Y, 1))$, expanding $g(X, Y)$ as a sum of homogeneous terms
$g(X, Y) = g_0 + g_1(X, Y) + \cdots + g_n(X, Y)$, and defining $m_P(G)$ to be the least $j$
such that $g_j$ is not the 0 polynomial. The homogeneous polynomial $g_j(X, Y)$ is $X^j$
times a polynomial in the one variable $YX^{-1}$, and the fact that $K$ is algebraically
closed implies that $g_j$ has a factorization of the form


$$g_j(X, Y) = c \prod_i (\alpha_i X + \beta_i Y)^{m_i}$$


¹¹ Fulton, p. 76.




with $c$ in $K$. Here $j = \sum_i m_i$, and the pairs $(\alpha_i, \beta_i)$ correspond to distinct
members of $\mathbb{P}^1_K$ that are uniquely determined up to indexing if $c \neq 0$. Let
$l_i(X, Y) = \alpha_i X + \beta_i Y$, and let $L_i$ be the corresponding projective line. We
refer to all the lines $L_i$ as the tangent lines to $G$ at $P$, and we say that $m_i$ is the
multiplicity of $L_i$. The geometry of the situation is indicated in Problem 12 at
the end of the chapter.


Theorem 8.10. Let $K$ be an algebraically closed field, let $P$ be in $\mathbb{P}^2_K$, and let
$F$ and $G$ be projective plane curves over $K$. Then the intersection multiplicity
$I(P, F \cap G)$ has the following properties:
(a) $I(P, F \cap G) = I(P, G \cap F)$,
(b) $I(P, F \cap G) = I(P, F \cap (G + HF))$ for any projective plane curve $H$
with $\deg HF = \deg G$ such that $G + HF \neq 0$,
(c) $I(P, F \cap G) > 0$ if and only if $P$ lies in $V_K(F) \cap V_K(G)$,
(d) $I(P, F \cap G) \le I(P, AF \cap BG)$ for any projective plane curves $A$ and
$B$, with equality if $A$ and $B$ are nonvanishing at $P$,
(e) $I(P, F \cap G)$ is finite if and only if $F$ and $G$ have no common factor of
degree $\ge 1$ having $P$ on its zero locus,
(f) $I(P, F \cap GH) = I(P, F \cap G) + I(P, F \cap H)$ and consequently if $F =
\prod_i F_i^{r_i}$ and $G = \prod_j G_j^{s_j}$, then $I(P, F \cap G) = \sum_{i,j} r_i s_j I(P, F_i \cap G_j)$,
(g) $I(P, F \cap G) \ge m_P(F)\, m_P(G)$, with equality if $F$ and $G$ have no tangent
lines in common at $P$.


REMARKS. Properties (a) and (b) are evident. Properties (c) and (d) are
conversational and will be proved in these remarks. Properties (e), (f), and (g)
require proofs, and we give those proofs after computing an example. For (c), if
$P$ lies in $V_K(F) \cap V_K(G)$, then the local expressions $f(X, Y)$ and $g(X, Y)$ vanish
at $(0, 0)$, and so does every member of the ideal $(f, g)$; therefore $(f, g)$ is a proper ideal
in $(K[X, Y])_{(0,0)}$, and the dimension of the quotient is positive. Conversely if $P$ is
not in $V_K(F)$, say, then $f(X, Y)$ lies in the multiplicative system $S$ of nonvanishing
polynomials at $(0, 0)$, and $S^{-1}(f, g) = (1)$; hence $S^{-1}K[X, Y]/S^{-1}(f, g) = 0$,
and $I(P, F \cap G) = 0$. For (d), $S^{-1}(af, bg) \subseteq S^{-1}(f, g)$ with equality if $a$ and
$b$ are nonvanishing at $(0, 0)$, and hence $S^{-1}K[X, Y]/S^{-1}(f, g)$ is a homomorphic
image of $S^{-1}K[X, Y]/S^{-1}(af, bg)$ and is a one-one homomorphic image if $a$ and
$b$ are nonvanishing at $(0, 0)$.


EXAMPLE 2 OF INTERSECTION MULTIPLICITY. Let $K = \mathbb{C}$, and let the two
projective curves be the homogeneous versions of $Y^2 = X^3$ and $Y^2 = X^5$. In
other words, let


$$F(X, Y, W) = Y^2W - X^3 \qquad \text{and} \qquad G(X, Y, W) = Y^2W^3 - X^5.$$




We compute $I(P, F \cap G)$ for all points $P$ in $V_K(F) \cap V_K(G)$. In the affine
plane the intersections $(x, y)$ may be found by substituting the one equation into
the other (or, with more effort in this case, by using the resultant). We obtain
$x^5 - x^3 = 0$. This gives $x^3(x^2 - 1) = 0$. The factor $x^2 - 1$ has two distinct
roots, and each gives two distinct $y$'s. Thus we obtain the five affine solutions
$(+1, \pm 1)$, $(-1, \pm i)$, $(0, 0)$. The fact that the first four occurred routinely with
multiplicity 1 translates into intersection multiplicity 1 for each: In fact, (b) shows
that $I(P, F \cap G) = I(P, F \cap (W^2F - G))$, and $W^2F - G$ restricts at $(X, Y, 1)$
to $X^5 - X^3 = X^3(X^2 - 1)$. At each of the points $(+1, \pm 1)$, $X^5 - X^3$ when
viewed as equal to 0 has a vertical tangent $X - 1$ of multiplicity 1, while $Y^2 - X^3$
has a tangent that is not vertical. A similar argument applies at each of the points
$(-1, \pm i)$. By (g), the intersection multiplicity is 1 at each of the four points
$(+1, \pm 1)$ and $(-1, \pm i)$.

Next let us consider $(0, 0)$. The order of $X^5 - X^3$ is 3, and the homogeneous
term of degree 3, namely $-X^3$, factors as the cube of a linear factor that gives
the vertical line $X$. Meanwhile, $Y^2 - X^3$ has order 2 at $(0, 0)$, and $Y^2$ factors
as the square of a linear factor that gives the horizontal line $Y$. The two curves
have no tangents in common. Hence equality holds in (g), and the intersection
multiplicity is 6 at $(0, 0)$.

Finally let us check points $(x, y, w)$ on the line at infinity, i.e., those with
$w = 0$. Putting $w = 0$ in the formula $F = G = 0$ shows that $x = 0$. Thus
the only point of $V_K(F) \cap V_K(G)$ on the line at infinity is $P = [x_0, y_0, w_0] =
[0, 1, 0]$. The local versions of $F$ and $G$ may be given in the variables $X$ and
$W$ by restricting $(X, Y, W)$ to $(X, 1, W)$ and considering the polynomials about
$(x, w) = (0, 0)$. As above, (b) gives $I(P, F \cap G) = I(P, F \cap (W^2F - G))$, but
$F = Y^2W - X^3$ restricts to $W - X^3$ and $W^2F - G = -W^2X^3 + X^5$ remains
unchanged upon restriction. The respective lowest-order terms, in factored form,
are $W$ and $X^3(X + W)(X - W)$. None of the factors of the first polynomial
matches a factor of the second polynomial, and (g) says that the intersection
multiplicity is $1 \cdot 5 = 5$.

The upshot is that we get multiplicity 6 from (0, 0), multiplicity 1 apiece from 
four other points in the affine plane, and multiplicity 5 from P = [0, 1, 0]. The 
total is 15, the product of the degrees of the given curves, as it must be if we are 
to have any chance of obtaining the desired generalization of Bezout’s Theorem. 
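The bookkeeping in this example can be spot-checked numerically. The sketch below is only a sanity check under our own naming; it does not compute intersection multiplicities, and the values 6, 1, and 5 are copied from the computation above. It verifies that the six listed points lie on both curves, that $W^2F - G = X^5 - X^3W^2$ as polynomials (by comparing values on a grid with more than $\deg = 5$ sample values in each variable), and that the multiplicities add up to $\deg F \cdot \deg G$.

```python
from itertools import product

def F(x, y, w):
    return y**2 * w - x**3

def G(x, y, w):
    return y**2 * w**3 - x**5

# Affine points (x, y), viewed as [x, y, 1]; 1j is the imaginary unit.
affine_points = [(1, 1), (1, -1), (-1, 1j), (-1, -1j), (0, 0)]
affine_ok = all(F(x, y, 1) == 0 and G(x, y, 1) == 0 for x, y in affine_points)

# The single point [0, 1, 0] on the line at infinity.
infinity_ok = F(0, 1, 0) == 0 and G(0, 1, 0) == 0

# W^2 F - G agrees with X^5 - X^3 W^2 on a 7 x 7 x 7 integer grid, which is
# enough to identify two polynomials of degree at most 5 in each variable.
grid = range(-3, 4)
identity_ok = all(
    w**2 * F(x, y, w) - G(x, y, w) == x**5 - x**3 * w**2
    for x, y, w in product(grid, repeat=3)
)

# Multiplicities as found in the text: 6 at the origin, 1 at each of the
# four other affine points, 5 at [0, 1, 0].
total = 6 + 4 * 1 + 5
print(affine_ok, infinity_ok, identity_ok, total == 3 * 5)
```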


To get at Theorem 8.10, we make use of a structure theorem about ideals $I$ in
$K[X_1, \dots, X_n]$ for which $V(I)$ is a finite set. To prove the structure theorem,
which appears as Theorem 8.12 below, we first prove a lemma about the radical
$\sqrt{I}$ of an ideal $I$, a notion defined in Section VII.1.


Lemma 8.11. If $R$ is a commutative Noetherian ring and $I$ is an ideal in $R$,
then $(\sqrt{I})^m \subseteq I$ for some integer $m \ge 1$.




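PROOF. Since $R$ is Noetherian, the ideal $\sqrt{I}$ is finitely generated. Let
$\{a_1, \dots, a_n\}$ be a set of generators for it. By definition of radical, choose integers
$k_1, \dots, k_n$ such that $a_j^{k_j}$ is in $I$ for $1 \le j \le n$, and put $m = \sum_{j=1}^n k_j$. The
general element of $\sqrt{I}$ is of the form $\sum_{j=1}^n r_j a_j$ with all $r_j$ in $R$. The $m$th power
of this element is a sum of terms of the form $r a_1^{l_1} \cdots a_n^{l_n}$ with $\sum_{j=1}^n l_j = m$. In
view of the definition of $m$, we must have $l_j \ge k_j$ for some $j$. Then the factor $a_j^{l_j}$
is in $I$, and hence the whole term $r a_1^{l_1} \cdots a_n^{l_n}$ is in $I$. The same expansion applies
to a product of $m$ arbitrary elements of $\sqrt{I}$, and hence $(\sqrt{I})^m \subseteq I$.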


Theorem 8.12. Let $K$ be an algebraically closed field, and let $I$ be an ideal
in the polynomial ring $K[X_1, \dots, X_n]$ whose locus of common zeros in $K^n$ is
a finite set $\{P_1, \dots, P_k\}$. Then $K[X_1, \dots, X_n]/I$ is isomorphic as a ring to the
product of its localizations at the points $P_j$:


$$K[X_1, \dots, X_n]/I \cong \prod_{j=1}^{k} (K[X_1, \dots, X_n]/I)_{(P_j)}.$$


Consequently


$$\dim_K (K[X_1, \dots, X_n]/I) = \sum_{j=1}^{k} \dim_K (K[X_1, \dots, X_n]/I)_{(P_j)}.$$


REMARKS. The one-variable case is a guide: The ideal $I$ is principal, and we
can write $K[X]/I$ as $K[X]/\big(\prod_{j=1}^{k} (X - c_j)^{m_j}\big)$. The points $P_j$ of the theorem are
the members $c_j$ of $K$, and the same argument as for the first example of the section
shows that $\big(K[X]/\big(\prod_i (X - c_i)^{m_i}\big)\big)_{(c_j)} \cong K[X]/((X - c_j)^{m_j})$. The isomorphism of
the theorem therefore reduces to an instance of the Chinese Remainder Theorem.


PROOF. Let $\varphi_j : K[X_1, \dots, X_n]/I \to (K[X_1, \dots, X_n]/I)_{(P_j)}$ be the canonical
homomorphism, and let $\varphi = (\varphi_1, \dots, \varphi_k)$. The mapping $\varphi$ is a ring homomorphism
into $\prod_{j=1}^{k} (K[X_1, \dots, X_n]/I)_{(P_j)}$, and we shall prove that $\varphi$ is one-one
onto. Doing so requires some preparation.

Let $I_j$ be the maximal ideal of all polynomials vanishing at $P_j$. The Nullstellensatz
(Theorem 7.1) shows that $\sqrt{I}$ consists of all $f \in K[X_1, \dots, X_n]$ such
that $f$ vanishes at each $P_j$, i.e., that $\sqrt{I} = \bigcap_{j=1}^{k} I_j$. Lemma 8.11 shows that
$(\sqrt{I})^m \subseteq I$ for some $m$, and thus $\big(\bigcap_{j=1}^{k} I_j\big)^m \subseteq I$. For $i \neq j$, $I_i^m + I_j^m$ is an
ideal whose locus of common zeros is empty, and the Nullstellensatz shows that
$I_i^m + I_j^m = K[X_1, \dots, X_n]$. The Chinese Remainder Theorem (Theorem 8.27
of Basic Algebra) therefore applies and shows that the intersection $\bigcap_{j=1}^{k} I_j^m$ and
the product $\prod_{j=1}^{k} I_j^m$ coincide. Similarly $I_i + I_j = K[X_1, \dots, X_n]$, and hence
$\bigcap_{j=1}^{k} I_j = \prod_{j=1}^{k} I_j$. Putting these facts together, we conclude that


$$\bigcap_{j=1}^{k} I_j^m = \prod_{j=1}^{k} I_j^m = \Big(\prod_{j=1}^{k} I_j\Big)^m = \Big(\bigcap_{j=1}^{k} I_j\Big)^m \subseteq I. \tag{$*$}$$
Let us now denote members of $K[X_1, \dots, X_n]$ by uppercase letters and their
cosets modulo $I$ by the corresponding lowercase letters. Let us observe for
$1 \le i \le k$ that there exists $F_i \in K[X_1, \dots, X_n]$ with $F_i(P_j) = \delta_{ij}$. In fact, we
start from the special case that if $P \neq Q$, then there exists $F$ with $F(P) = 1$
and $F(Q) = 0$. For the special case, $P$ and $Q$ differ in some coordinate; say that
$x_i(P) \neq x_i(Q)$. Then the polynomial


$$F(X_1, \dots, X_n) = (X_i - x_i(Q))\,(x_i(P) - x_i(Q))^{-1}$$


has the required properties. To construct $F_1$ with $F_1(P_j) = \delta_{1j}$, choose $G_j$
with $G_j(P_1) = 1$ and $G_j(P_j) = 0$ for $2 \le j \le k$. Then $F_1 = \prod_{j=2}^{k} G_j$ has $F_1(P_1) = 1$ and
$F_1(P_j) = 0$ for $j \neq 1$. The polynomials $F_2, \dots, F_k$ are constructed similarly.

With $m$ as in the second paragraph of the proof, fix $j$ and define $E_i =
1 - (1 - F_i^m)^m$. This is divisible by $F_i^m$ and hence lies in $I_j^m$ if $i \neq j$. In
addition, $1 - F_j^m$ lies in $I_j$, and hence $1 - E_j = (1 - F_j^m)^m$ is in $I_j^m$. Therefore
$1 - \sum_{i=1}^{k} E_i = (1 - E_j) - \sum_{i \neq j} E_i$ lies in $I_j^m$. Since the left side is independent
of $j$, $1 - \sum_{i=1}^{k} E_i$ lies in $\bigcap_{j=1}^{k} I_j^m$, and we conclude from (*) that


$$1 - \sum_{i=1}^{k} E_i \ \text{ lies in } I. \tag{$**$}$$


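We just saw that $E_i$ lies in $\bigcap_{l \neq i} I_l^m$. Hence if $i \neq j$, then $E_i E_j$ lies in $\bigcap_{l=1}^{k} I_l^m \subseteq
I$. Passing to cosets modulo $I$, we find from this fact and from (**) that


$$e_i e_j = 0 \ \text{ for } i \neq j, \qquad \text{and that} \qquad \sum_{i=1}^{k} e_i = 1. \tag{$\dagger$}$$


Multiplying the second equation by $e_i$ and substituting from the first equation,
we obtain


$$e_i^2 = e_i \ \text{ for all } i. \tag{$\dagger\dagger$}$$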
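The construction of the idempotents can be traced in the one-variable guiding example $I = (X^2(X-1)^2)$, where $P_1 = 0$, $P_2 = 1$, and $m = 2$ works because $(\sqrt{I})^2 = (X(X-1))^2 = I$. The following is a minimal sketch under assumptions of ours: integer coefficients stand in for $K$, the choices $F_1 = 1 - X$ and $F_2 = X$ are one possible pair with $F_i(P_j) = \delta_{ij}$, and the helper names are hypothetical. It forms $E_i = 1 - (1 - F_i^m)^m$ and checks the two displayed relations modulo $I$.

```python
# Polynomials in K[X] as integer coefficient lists, lowest degree first.

def mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def add(a, b, sign=1):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + sign * y for x, y in zip(a, b)]

def power(a, m):
    out = [1]
    for _ in range(m):
        out = mul(out, a)
    return out

def mod(a, monic):
    """Remainder modulo a monic polynomial, padded to length deg(monic)."""
    a = list(a)
    d = len(monic) - 1
    for k in range(len(a) - 1, d - 1, -1):
        c = a[k]
        for i, coeff in enumerate(monic):
            a[k - d + i] -= c * coeff
    a = a[:d]
    return a + [0] * (d - len(a))

MODULUS = [0, 0, 1, -2, 1]          # X^2 (X - 1)^2
M_EXP = 2                           # the exponent m of Lemma 8.11 in this example

def idempotent(F_i):
    """Coset of E_i = 1 - (1 - F_i^m)^m modulo I."""
    one_minus = add([1], power(F_i, M_EXP), sign=-1)
    return mod(add([1], power(one_minus, M_EXP), sign=-1), MODULUS)

e1 = idempotent([1, -1])            # from F_1 = 1 - X
e2 = idempotent([0, 1])             # from F_2 = X

print(e1, e2)
print(mod(mul(e1, e2), MODULUS), mod(add(e1, e2), MODULUS), mod(mul(e1, e1), MODULUS))
```

In this example $e_1$ is the coset of $1 - 3X^2 + 2X^3 = (X-1)^2(2X+1)$ and $e_2$ is the coset of $3X^2 - 2X^3$, which are the idempotents that the Chinese Remainder Theorem produces for $K[X]/(X^2) \times K[X]/((X-1)^2)$.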


Using (†) and (††), let us prove for each $i$ that


to each $G \in K[X_1, \dots, X_n]$ with $G(P_i) \neq 0$
corresponds a polynomial $H$ with $hg = e_i$.   (‡)


In fact, we may assume that $G(P_i) = 1$. Let $Q$ be the member of $I_i$ given by
$Q = 1 - G$. The element $Q^m E_i$ is in $I_i^m$ because $Q$ is in $I_i$, and it is in $I_j^m$
for $j \neq i$ because $E_i$ is in $I_j^m$ for $j \neq i$. Thus $Q^m E_i$ is in $\bigcap_{j=1}^{k} I_j^m \subseteq I$, and
$q^m e_i = 0$. Consequently


$$g(e_i + qe_i + \cdots + q^{m-1}e_i) = (1 - q)e_i(1 + q + \cdots + q^{m-1}) = e_i(1 - q^m) = e_i,$$


and $H = E_i(1 + Q + \cdots + Q^{m-1})$ is a polynomial as in (‡).

Now we can prove that $\varphi$ is one-one. If $f$ is a member of $K[X_1, \dots, X_n]/I$
such that $\varphi(f) = 0$, then $\varphi_i(f) = 0$ for all $i$. This means that there exists a
member $g_i$ of the multiplicative system for localization at $P_i$ such that $g_i f = 0$.
Any corresponding polynomial $G_i$ has $G_i(P_i) \neq 0$. By (‡), there exists $h_i$ with
$h_i g_i = e_i$. Then (†) gives $f = \sum_{i=1}^{k} e_i f = \sum_{i=1}^{k} h_i g_i f = 0$. Thus $\varphi$ is
one-one.

For the proof that $\varphi$ is onto, we recall that the multiplicative system used to
obtain $(K[X_1, \dots, X_n]/I)_{(P_j)}$ consists of the elements of $K[X_1, \dots, X_n]/I$ that
are nonzero at $P_j$, and $\varphi_j$ carries these to units in $(K[X_1, \dots, X_n]/I)_{(P_j)}$. Since
$E_j(P_j) = 1$, $\varphi_j(e_j)$ is a unit. For $i \neq j$, we have $\varphi_j(e_j)\varphi_j(e_i) = \varphi_j(e_j e_i) = 0$,
and therefore $\varphi_j(e_i) = 0$. Consequently


$$\varphi_j(e_j) = \sum_{i=1}^{k} \varphi_j(e_i) = \varphi_j\Big( \sum_{i=1}^{k} e_i \Big) = \varphi_j(1) = 1,$$


and $\varphi_j(e_j)$ is the identity of $(K[X_1, \dots, X_n]/I)_{(P_j)}$. The localization at
$P_j$ consists of the equivalence classes of all pairs $(r_j, s_j)$ with $r_j$ and $s_j$ in
$K[X_1, \dots, X_n]/I$ and $s_j$ in the multiplicative system for index $j$. Thus let
such pairs $(r_j, s_j)$ be given for $1 \le j \le k$. We are to produce an element $a$
of $K[X_1, \dots, X_n]/I$ such that $\varphi_j(a) = \varphi_j(r_j)(\varphi_j(s_j))^{-1}$ for all $j$. Use of (‡)
produces $h_j$ with $h_j s_j = e_j$ for all $j$, and this element has the property that
$\varphi_j(h_j)\varphi_j(s_j) = \varphi_j(e_j) = 1$, hence that $\varphi_j(h_j) = \varphi_j(s_j)^{-1}$. Consequently the
element $a = \sum_i r_i h_i e_i$ has the property that


$$\varphi_j(a) = \varphi_j\Big( \sum_i r_i h_i e_i \Big) = \sum_i \varphi_j(r_i)\varphi_j(h_i)\varphi_j(e_i) = \varphi_j(r_j)(\varphi_j(s_j))^{-1}$$


and exhibits $\varphi$ as onto.


Corollary 8.13. Let $K$ be an algebraically closed field, and let $I$ be an ideal
in the polynomial ring $K[X_1, \dots, X_n]$ whose locus of common zeros in $K^n$ is a
finite set $\{P_1, \dots, P_k\}$. Then $K[X_1, \dots, X_n]/I$ is finite-dimensional, and so is
the localization $(K[X_1, \dots, X_n]/I)_{(P_j)}$ for each $j$.




PROOF. This is a corollary partly of the statement of Theorem 8.12 and partly
of the proof. Let $m$ be as in the proof. If $I_0$ is the maximal ideal $(X_1, \dots, X_n)$
of $K[X_1, \dots, X_n]$, then $I_0^m$ is the ideal generated by all monomials of degree $m$,
and $K[X_1, \dots, X_n]/I_0^m$ is finite-dimensional. Consequently the maximal ideal
$I_j = (X_1 - x_1(P_j), \dots, X_n - x_n(P_j))$ has the property that $K[X_1, \dots, X_n]/I_j^m$
is finite-dimensional. Since $I_i^m + I_j^m = K[X_1, \dots, X_n]$ for $i \neq j$, the Chinese
Remainder Theorem shows that


$$K[X_1, \dots, X_n] \Big/ \bigcap_{j=1}^{k} I_j^m \;\cong\; \prod_{j=1}^{k} K[X_1, \dots, X_n]/I_j^m,$$


and the left side is therefore finite-dimensional. By (*) in the proof of Theorem
8.12, $\bigcap_{j=1}^{k} I_j^m \subseteq I$, and hence $K[X_1, \dots, X_n]/I$ is finite-dimensional. Then
$(K[X_1, \dots, X_n]/I)_{(P_j)}$ is finite-dimensional as a consequence of the statement
of Theorem 8.12.


PROOF OF THEOREM 8.10e. If $F$ and $G$ have a common factor $H$ of degree
$\ge 1$ such that $H(P) = 0$, we may assume that $H$ is irreducible. Introduce affine
local coordinates about $P$. If $f, g, h$ denote the local versions of $F, G, H$, then
the ideal $(f, g)$ of $K[X, Y]$ is contained in the principal ideal $(h)$. The latter
ideal is proper because $h(0, 0) = 0$, and the irreducibility of $H$ thus implies that
$(h)$ is prime. If $S$ denotes the multiplicative system in $K[X, Y]$ of polynomials
that are nonvanishing at $(0, 0)$, then $S^{-1}(f, g) \subseteq S^{-1}(h)$, and we have a natural
quotient homomorphism of $S^{-1}K[X, Y]/S^{-1}(f, g)$ onto $S^{-1}K[X, Y]/S^{-1}(h)$.
The latter is isomorphic as a $K$ algebra to $(K[X, Y]/(h))_{(0,0)}$, and the dimension
of this localization is a lower bound for $I(P, F \cap G)$. Since $K[X, Y]/(h)$ is an
integral domain, $K[X, Y]/(h)$ maps one-one into any localization of itself, and
$\dim_K (K[X, Y]/(h))$ is a lower bound for $I(P, F \cap G)$. Since $h$ is nonconstant,
either $X$ or $Y$ actually occurs in it, say $Y$. Then $h$ divides no nonzero member of $K[X]$, and
the mapping of $K[X]$ into cosets modulo $(h)$ is one-one. Therefore $K[X, Y]/(h)$
contains a subalgebra isomorphic to $K[X]$ and must be infinite-dimensional.

Conversely if $F$ and $G$ have no common factor of degree $\ge 1$ with $P$ on its
locus, then (d) shows that we may assume $F$ and $G$ to have no common factor of
degree $\ge 1$ of any kind. In this case Theorem 8.5 shows that the locus of common
zeros of $F$ and $G$ is finite, and Corollary 8.13 shows that $I(P, F \cap G)$ is finite.


PROOF OF THEOREM 8.10f. We are to prove that

$$I(P, F \cap GH) = I(P, F \cap G) + I(P, F \cap H). \tag{$*$}$$

If $F$ and $GH$ have a common factor of degree $\ge 1$ that vanishes at $P$, then $F$ and
one of $G$ and $H$ have such a factor. By symmetry we may assume that $F$ and $G$
have that common factor. Then the left side of (*) and the first term on the right
are infinite by (e), and (*) is verified.

Thus we may assume that $F$ and $GH$ have no common factor that vanishes
at $P$. If $F$ has a prime factor that does not vanish at $P$, then (d) shows that we
can drop that factor from all three appearances of $F$ in (*). In other words, it is
enough to prove (f) under the assumption that $F$ and $GH$ have no common factor
of degree $\ge 1$ of any kind.

With this assumption in place, introduce affine local coordinates about $P$, let $S$
denote the multiplicative system in $K[X, Y]$ of polynomials that are nonvanishing
at $(0, 0)$, and let $f, g, h$ be the local versions of the given curves $F, G, H$. The
inclusion of ideals $(f, gh) \subseteq (f, g)$ induces an inclusion $S^{-1}(f, gh) \subseteq S^{-1}(f, g)$
and then an onto algebra homomorphism


$$\varphi : S^{-1}K[X, Y]/S^{-1}(f, gh) \to S^{-1}K[X, Y]/S^{-1}(f, g).$$


We shall exhibit a $K$ vector-space isomorphism $\psi$ of $S^{-1}K[X, Y]/S^{-1}(f, h)$ onto
$\ker \varphi$, and the resulting dimensional equality


$$\dim_K \big(S^{-1}K[X, Y]/S^{-1}(f, gh)\big)
= \dim_K \big(S^{-1}K[X, Y]/S^{-1}(f, g)\big) + \dim_K \big(S^{-1}K[X, Y]/S^{-1}(f, h)\big) \tag{$**$}$$


will prove (*) and hence (f). We define

$$\Psi : S^{-1}K[X, Y] \to S^{-1}K[X, Y]/S^{-1}(f, gh)$$

as a $K$ linear map by $\Psi(u) = gu + S^{-1}(f, gh)$. If $af + bh$ is in $S^{-1}(f, h)$, then
$\Psi(af + bh) = afg + bgh + S^{-1}(f, gh) = S^{-1}(f, gh)$. Thus $\Psi$ descends to a
$K$ linear map $\psi$ of $S^{-1}K[X, Y]/S^{-1}(f, h)$ into $S^{-1}K[X, Y]/S^{-1}(f, gh)$. It is
evident that $\varphi\Psi = 0$ and hence that $\varphi\psi = 0$, i.e., $\operatorname{image} \psi \subseteq \ker \varphi$.

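If any member $u + S^{-1}(f, gh)$ of $\ker \varphi$ is given, then $0 = \varphi(u + S^{-1}(f, gh)) =
u + S^{-1}(f, g)$ shows that $u$ is in $S^{-1}(f, g)$. Say that $u = af + bg$. Then
$\psi(b + S^{-1}(f, h)) = bg + S^{-1}(f, gh) = bg + af + S^{-1}(f, gh) = u + S^{-1}(f, gh)$
shows that $\operatorname{image} \psi \supseteq \ker \varphi$. Hence $\operatorname{image} \psi = \ker \varphi$, i.e., $\psi$ is onto $\ker \varphi$.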

To see that $\psi$ is one-one, suppose that $\psi(u + S^{-1}(f, h))$ is the 0 coset, i.e.,
that $gu + S^{-1}(f, gh) = S^{-1}(f, gh)$. Then $gu = af + bgh$ with $u, a, b$ in
$S^{-1}K[X, Y]$. Clearing fractions, we may assume that $u, a, b$ are in $K[X, Y]$.
The formula $g(u - bh) = af$ in $K[X, Y]$, in the presence of the assumption that
$F$ and $G$ have no common factor of degree $\ge 1$, implies that $f$ divides $u - bh$.
Write $u - bh = cf$ with $c$ in $K[X, Y]$. Then $u = cf + bh$, and $u$ lies in the ideal
$(f, h)$. In other words, $u + S^{-1}(f, h)$ is the trivial coset, and $\psi$ has been shown
to be one-one. This proves (**) and hence (f).




Lemma 8.14. For any field $K$, let $\{L_i\}_{i \ge 1}$ be a system of nonzero homogeneous
polynomials in $K[X, Y]$ of the form $L_i = a_iX + b_iY$, let $\{M_j\}_{j \ge 1}$ be another
such system with $M_j = c_jX + d_jY$, and suppose that no $L_i$ is a scalar multiple of
some $M_j$. For $n \ge 1$, let $B_0, \dots, B_n$ be the system of homogeneous polynomials


$$B_k = L_1 \cdots L_k M_1 \cdots M_{n-k} \qquad \text{for } 0 \le k \le n.$$


Then $\{B_0, \dots, B_n\}$ is a vector-space basis of the space $K[X, Y]_n$ of all homogeneous
polynomials in $(X, Y)$ of degree $n$.


PROOF. The set $\{B_0, \dots, B_n\}$ has $n + 1$ elements, and $n + 1$ is the dimension
of $K[X, Y]_n$ because $\{X^n, X^{n-1}Y, \dots, Y^n\}$ is a basis. Thus it is enough to show
that $\{B_0, \dots, B_n\}$ is linearly independent. If we have a relation $\sum_{k=0}^{n} c_k B_k = 0$
for scalars $c_k$, then we observe that $L_1$ divides each $B_k$ for $k > 0$, and $L_1$ does
not divide $B_0$ because by assumption $L_1$ does not divide any factor $M_j$. Thus
$c_0 = 0$. In effect, case $n$ of the lemma has now been reduced to case $n - 1$, and
the result readily follows by induction.
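The lemma can be tested in small cases by writing each $B_k$ in the monomial basis and computing a rank. The sketch below makes several assumptions of ours: it works over $\mathbb{Q}$ with the sample systems $L_i = X + iY$ and $M_j = X - jY$, for which no $L_i$ is a scalar multiple of any $M_j$; a binary form of degree $n$ is stored as the list of its coefficients on $X^n, X^{n-1}Y, \dots, Y^n$; and the helper names are hypothetical.

```python
from fractions import Fraction

def form_mul(a, b):
    """Product of two binary forms given by coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def rank(rows):
    """Rank of a matrix over Q by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def lemma_forms(n, L, M):
    """Coefficient rows of B_k = L_1 ... L_k M_1 ... M_{n-k} for 0 <= k <= n."""
    rows = []
    for k in range(n + 1):
        b = [1]
        for i in range(1, k + 1):
            b = form_mul(b, L(i))
        for j in range(1, n - k + 1):
            b = form_mul(b, M(j))
        rows.append(b)
    return rows

L = lambda i: [1, i]       # L_i = X + i Y
M = lambda j: [1, -j]      # M_j = X - j Y

ranks = {n: rank(lemma_forms(n, L, M)) for n in range(1, 6)}
# If the hypothesis fails, e.g. M_j = L_j, then B_0 = B_1 when n = 1.
degenerate_rank = rank(lemma_forms(1, L, L))
print(ranks, degenerate_rank)
```

The rank is $n + 1$ in each tested degree, while the degenerate choice shows that the hypothesis on scalar multiples cannot simply be dropped.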


PROOF OF THEOREM 8.10g. Put $p = m_P(F)$ and $q = m_P(G)$. We pass
to affine local coordinates about $P$, letting $f$ and $g$ be the members of $K[X, Y]$
corresponding to $F$ and $G$. If $I$ denotes the maximal ideal $I = (X, Y)$ in $K[X, Y]$,
then $f$ lies in $I^p$ and $g$ lies in $I^q$. We form the following sequence of $K$ vector
spaces and $K$ linear mappings:


$$K[X, Y]/I^q \oplus K[X, Y]/I^p \xrightarrow{\ \psi\ } K[X, Y]/I^{p+q} \xrightarrow{\ \varphi\ } K[X, Y]/(I^{p+q} + (f, g)) \to 0.$$


Here the mapping $\varphi$ is the algebra homomorphism induced by the inclusion
$I^{p+q} \subseteq I^{p+q} + (f, g)$, and it is onto $K[X, Y]/(I^{p+q} + (f, g))$. The mapping $\psi$
is defined by

$$\psi(a + I^q, b + I^p) = af + bg + I^{p+q}$$

and is merely $K$ linear.
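Let us see that the sequence is exact at $K[X, Y]/I^{p+q}$. Since


$$\varphi\psi(a + I^q, b + I^p) = \varphi(af + bg + I^{p+q}) = I^{p+q} + (f, g),$$


we obtain $\operatorname{image} \psi \subseteq \ker \varphi$. If $h + I^{p+q}$ is in $\ker \varphi$, then $h$ is in $I^{p+q} + (f, g)$,
hence is of the form $u + af + bg$ with $u$ in $I^{p+q}$. Then $h - u = af + bg$, and
$\psi(a + I^q, b + I^p) = h - u + I^{p+q} = h + I^{p+q}$. So $\operatorname{image} \psi \supseteq \ker \varphi$, and we
have $\operatorname{image} \psi = \ker \varphi$.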

The mapping $\psi$ descends to a one-one linear map of


$$M = \big(K[X, Y]/I^q \oplus K[X, Y]/I^p\big) / \ker \psi$$


into $K[X, Y]/I^{p+q}$. The vector space $K[X, Y]/I^q$ may be identified with the
space of all polynomials of degree less than $q$, and that space is finite-dimensional.
Similarly $K[X, Y]/I^p$ is finite-dimensional, and therefore


$$\dim_K M = \dim_K K[X, Y]/I^q + \dim_K K[X, Y]/I^p - \dim_K \ker \psi. \tag{$*$}$$


Meanwhile, $\varphi$ exhibits $K[X, Y]/(I^{p+q} + (f, g))$ as isomorphic as a vector space
to $(K[X, Y]/I^{p+q})/M$. Consequently


$$\dim_K K[X, Y]/I^{p+q} = \dim_K M + \dim_K K[X, Y]/(I^{p+q} + (f, g)). \tag{$**$}$$


Combining (*) and (**) with the simple vector-space isomorphism $K[X, Y]/I^d \cong
K[X, Y, W]_{d-1}$ and with the fact from Section 3 that $\dim_K K[X, Y, W]_{d-1} =
\binom{d+1}{2}$ gives

$$\begin{aligned}
&\dim_K K[X, Y]/(I^{p+q} + (f, g)) \\
&\quad = \dim_K K[X, Y]/I^{p+q} - \dim_K K[X, Y]/I^q \\
&\qquad\quad - \dim_K K[X, Y]/I^p + \dim_K \ker \psi \\
&\quad \ge \dim_K K[X, Y]/I^{p+q} - \dim_K K[X, Y]/I^q - \dim_K K[X, Y]/I^p \\
&\quad = \binom{p+q+1}{2} - \binom{q+1}{2} - \binom{p+1}{2} \\
&\quad = pq,
\end{aligned} \tag{$\dagger$}$$

with equality on the fourth line if and only if $\ker \psi = 0$.
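The two counting facts behind the last two lines of (†) are elementary and can be confirmed by brute force. The following sketch, with function and variable names of our own choosing, counts the monomials $X^iY^j$ with $i + j < d$, which span a complement to $I^d$ in $K[X, Y]$, and then checks the binomial identity over a range of values of $p$ and $q$.

```python
from math import comb

def dim_quotient(d):
    """dim_K K[X, Y]/I^d, counted as the number of monomials X^i Y^j with i + j < d."""
    return sum(1 for i in range(d) for j in range(d) if i + j < d)

# dim_K K[X, Y]/I^d = C(d + 1, 2) for small d.
counts_match = all(dim_quotient(d) == comb(d + 1, 2) for d in range(0, 20))

# C(p + q + 1, 2) - C(q + 1, 2) - C(p + 1, 2) = pq.
identity_holds = all(
    comb(p + q + 1, 2) - comb(q + 1, 2) - comb(p + 1, 2) == p * q
    for p in range(1, 30)
    for q in range(1, 30)
)

print(counts_match, identity_holds)
```

The identity itself follows from expanding $\tfrac12\big[(p+q+1)(p+q) - (q+1)q - (p+1)p\big] = pq$; the loop is only a check.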
The locus of common zeros of $I^{p+q} + (f, g)$ is just $\{(0, 0)\}$, and Theorem 8.12
therefore shows that


$$\dim_K \big(K[X, Y]/(I^{p+q} + (f, g))\big)_{(0,0)} = \dim_K K[X, Y]/(I^{p+q} + (f, g)). \tag{$\dagger\dagger$}$$


The inclusion $(f, g) \subseteq I^{p+q} + (f, g)$ induces an algebra homomorphism of
$(K[X, Y]/(f, g))_{(0,0)}$ onto $(K[X, Y]/(I^{p+q} + (f, g)))_{(0,0)}$. Therefore


$$\dim_K \big(K[X, Y]/(f, g)\big)_{(0,0)} \ge \dim_K \big(K[X, Y]/(I^{p+q} + (f, g))\big)_{(0,0)}. \tag{$\ddagger$}$$


Let $S$ be the set-theoretic complement of $I = (X, Y)$ in $K[X, Y]$. Because of the
isomorphism $S^{-1}K[X, Y]/S^{-1}J \cong (K[X, Y]/J)_{(0,0)}$ for any ideal $J$, equality
will hold in (‡) if $S^{-1}(f, g) = S^{-1}(I^{p+q} + (f, g))$. Combining (†), (††), and
(‡), we find that

$$I(P, F \cap G) \ge pq, \tag{$\ddagger\ddagger$}$$

with equality if

$$I^{p+q} \subseteq S^{-1}(f, g) \qquad \text{and} \qquad \psi \text{ is one-one.} \tag{$\S$}$$


Inequality (‡‡) completes the proof of the inequality in (g) of the theorem.
Because equality holds in (‡‡) if (§) holds, we can complete the proof of all
of (g) by showing that (§) holds if $F$ and $G$ have no tangent line in common.

Thus for the remainder of the proof, we assume that $F$ and $G$ have no tangent
line in common. Let the tangent lines of $F$, repeated according to their multiplicities,
be $L_1, \dots, L_p$, and let the tangent lines of $G$ be $M_1, \dots, M_q$. Define $L_i$ for
$i > p$ to be $L_p$, and define $M_j$ for $j > q$ to be $M_q$.

In order to prove the first conclusion of (§), namely that $I^{p+q} \subseteq S^{-1}(f, g)$,
we shall prove that $I^t \subseteq S^{-1}(f, g)$ for $t$ sufficiently large, and then we shall prove
by induction downward on $t$ that $I^t \subseteq S^{-1}(f, g)$ as long as $t \ge p + q$. If $f$
and $g$ were to have a nonconstant common factor, then a tangent line for that
common factor would be a tangent line for both $f$ and $g$, and no such tangent
line exists according to our assumption. Therefore Bezout’s Theorem (Theorem
8.2) applies to $f$ and $g$ and shows that their locus of common zeros is finite. Let
it be $\{(0, 0), Q_1, \dots, Q_l\}$. The third paragraph of the proof of Theorem 8.12
shows that there exists a polynomial $h$ in $K[X, Y]$ such that $h(0, 0) = 1$ and
$h(Q_i) = 0$ for $1 \le i \le l$. Then $Xh$ and $Yh$ vanish on $\{(0, 0), Q_1, \dots, Q_l\}$, and
the Nullstellensatz (Theorem 7.1) shows that there exists $N$ such that $(Xh)^N$ and
$(Yh)^N$ lie in $(f, g)$. Since $h$ is in the multiplicative system $S$, $X^N$ and $Y^N$ lie in
$S^{-1}(f, g)$. Any monomial of degree $\ge 2N$ contains either a factor $X^N$ or a factor
$Y^N$, and consequently $I^{2N} \subseteq S^{-1}(f, g)$.

Proceeding inductively downward on $t$, suppose that $I^t \subseteq S^{-1}(f,g)$ and that $t-1 \geq p+q$. As in Lemma 8.14, the polynomials defined by $B_k = L_1 \cdots L_k M_1 \cdots M_{t-1-k}$ for $0 \leq k \leq t-1$ form a vector-space basis of $K[X,Y]_{t-1}$. We show that each of these lies in $S^{-1}(f,g)$; then we can conclude that $I^{t-1} \subseteq S^{-1}(f,g)$, and our induction will be complete. Let $f = f_p + f_{p+1} + \cdots$ and $g = g_q + g_{q+1} + \cdots$ be the expansions of $f$ and $g$ as sums of homogeneous polynomials in $(X,Y)$. If $B_k$ is given, then an inequality $k \geq p$ would imply that $B_k$ contains a factor $L_1 \cdots L_p$; this is $f_p$ up to a constant factor. An inequality $t-1-k \geq q$ would imply that $B_k$ contains a factor $M_1 \cdots M_q$; this is $g_q$ up to a constant factor. Since $k < p$ and $t-1-k < q$ would together imply the inequality $t-1 < p+q$ that we are assuming not to be the case, one of the alternatives $k \geq p$ and $t-1-k \geq q$ must occur. Say the first occurs. Except for a constant factor, we then have $B_k = f_pC$ for some homogeneous polynomial $C(X,Y)$ of degree $t-1-p$. Substituting for $f_p$ gives $B_k = (f - f_{p+1} - \cdots)C$. Each term $f_{p+r}C$ with $r > 0$ is of degree $(p+r) + (t-1-p) > t-1$ and therefore lies in $I^t \subseteq S^{-1}(f,g)$. Also, the term $fC$ lies in $S^{-1}(f,g)$. Hence $B_k$ lies in $S^{-1}(f,g)$. This completes the induction, and we conclude that $I^{p+q} \subseteq S^{-1}(f,g)$.




In order to prove the second conclusion of $(\S)$, namely that $\psi$ is one-one, suppose that $0 = \psi(a + I^q, b + I^p) = af + bg + I^{p+q}$, i.e., that all terms of $af + bg$ are of order $\geq p+q$. Write $a = a_r + a_{r+1} + \cdots$ with $a_r \neq 0$ if $a$ is not in $I^q$, and write $b = b_s + b_{s+1} + \cdots$ with $b_s \neq 0$ if $b$ is not in $I^p$, so that

\[ af + bg = a_rf_p + b_sg_q + (\text{higher-order terms}). \]

The right side is assumed to be in $I^{p+q}$, which means that one of the following two conditions is satisfied:

(i) $r + p = s + q < p + q$ and $a_rf_p + b_sg_q = 0$,

(ii) $a_rf_p$ is in $I^{p+q}$, and $b_sg_q$ is in $I^{p+q}$.

If (i) holds, then the facts that $a_rf_p = -b_sg_q$ and that $f$ and $g$ have no tangent lines in common imply that $f_p$ divides $b_s$. Since $s < p$, we must have $b_s = 0$. Therefore $a_r = 0$, and the conditions on $a_r$ and $b_s$ imply that $a$ is in $I^q$ and $b$ is in $I^p$, which we are trying to show. If (ii) holds, then the fact that $a_rf_p$ is in $I^{p+q}$ implies that $a_r = 0$ or $r \geq q$; in either case, $a$ is in $I^q$. Similarly the fact that $b_sg_q$ is in $I^{p+q}$ implies that $b_s = 0$ or $s \geq p$; in either case, $b$ is in $I^p$. We conclude that $\psi$ is one-one, as was to be shown.


6. General Form of Bezout’s Theorem for Plane Curves 


With the discussion complete concerning intersection multiplicity for general 
projective plane curves, we arrive at the general form of Bezout’s Theorem for 
plane curves. 


Theorem 8.15 (Bezout’s Theorem). Let $K$ be an algebraically closed field, and let $F$ and $G$ be projective plane curves over $K$ of respective degrees $m$ and $n$. If $F$ and $G$ have no common factor of positive degree, then

\[ \sum_{P \in \mathbb{P}^2_K} I(P, F \cap G) = mn. \]


REMARKS. The sum over P has only finitely many nonzero terms by Theorem 
8.5, and each intersection multiplicity in the sum is finite by Theorem 8.10e. 


PROOF. Theorem 8.5 shows that the locus of common zeros of $F$ and $G$ is a finite set. By applying a suitable $\Phi$ in $GL(3,K)$, we may assume that none of these zeros lies on the line at infinity, namely $W$. To do so, we choose a point $P$ not in the finite set of common zeros. There are only finitely many lines passing through $P$ and some member of the set of common zeros, and we choose a line through $P$ different from all these. If $\Phi$ is chosen so as to move this line to the line at infinity $W$, then none of the common zeros will lie on the line $W$.




With this normalization in place, let $\{P_1, \ldots, P_k\}$ be the set of common zeros of $F$ and $G$. We introduce local versions $f$ and $g$ of $F$ and $G$ by the definitions $f(X,Y) = F(X,Y,1)$ and $g(X,Y) = G(X,Y,1)$. Application of Theorem 8.12 to the ideal $J = (f,g)$ in $K[X,Y]$ gives

\[ \dim_K K[X,Y]/(f,g) = \sum_{j=1}^{k} \dim_K \big(K[X,Y]/(f,g)\big)_{P_j} = \sum_{j=1}^{k} I(P_j, F \cap G). \]


The theorem will therefore follow if we prove that

\[ \dim_K K[X,Y]/(f,g) = mn. \qquad (*) \]

To prove $(*)$, we shall first prove a related equality concerning $K[X,Y,W]$ and the ideal $(F,G)$ in it, and then we shall use the fact that $F$ and $G$ have no common zeros with $W$ to transfer the conclusion to $K[X,Y]$.

Define $K$ linear mappings $\varphi : K[X,Y,W] \oplus K[X,Y,W] \to K[X,Y,W]$ and $\psi : K[X,Y,W] \to K[X,Y,W] \oplus K[X,Y,W]$ by

\[ \varphi(A,B) = AF + BG \quad\text{and}\quad \psi(C) = (CG, -CF), \]

and form the sequence of $K$ vector spaces and $K$ linear maps given by

\[ 0 \longrightarrow K[X,Y,W] \xrightarrow{\ \psi\ } K[X,Y,W] \oplus K[X,Y,W] \xrightarrow{\ \varphi\ } K[X,Y,W]. \qquad (**) \]

It is evident that $\psi$ is one-one, that $\varphi\psi = 0$, and that $\operatorname{image}\varphi = (F,G)$. If $(A,B)$ is in $\ker\varphi$, then $AF + BG = 0$. Since $F$ and $G$ have no common factor of positive degree, $F$ divides $B$ and $G$ divides $A$. Setting $C = AG^{-1}$ therefore gives $A = CG$ and $B = -AG^{-1}F = -CF$. Hence $(A,B)$ lies in $\operatorname{image}\psi$. In other words, $(**)$ is exact, and $\operatorname{image}\varphi = (F,G)$.

Let $d \geq m+n$. If we denote by $\psi_d$ and $\varphi_d$ the restrictions of $\psi$ and $\varphi$ to $K[X,Y,W]_{d-m-n}$ and $K[X,Y,W]_{d-m} \oplus K[X,Y,W]_{d-n}$, respectively, and if we go over the argument in the previous paragraph, then we see that the sequence

\[ 0 \longrightarrow K[X,Y,W]_{d-m-n} \xrightarrow{\ \psi_d\ } K[X,Y,W]_{d-m} \oplus K[X,Y,W]_{d-n} \xrightarrow{\ \varphi_d\ } K[X,Y,W]_d \]

is exact and that $\operatorname{image}\varphi_d = (F,G)_d$. The vector spaces in question here are all finite-dimensional, and thus we obtain

\[
\begin{aligned}
\dim_K (F,G)_d
&= \dim_K K[X,Y,W]_{d-n} + \dim_K K[X,Y,W]_{d-m} - \dim_K K[X,Y,W]_{d-m-n} \\
&= \tbinom{d-n+2}{2} + \tbinom{d-m+2}{2} - \tbinom{d-m-n+2}{2} \\
&= -mn + \tbinom{d+2}{2} \\
&= -mn + \dim_K K[X,Y,W]_d. \qquad (\dagger)
\end{aligned}
\]
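The step from the three binomial coefficients to $-mn + \binom{d+2}{2}$ is a routine expansion, and it can also be confirmed numerically. The sketch below is in Python; the helper names `dim_homogeneous` and `dim_ideal_part` are illustrative only. It uses the count $\dim_K K[X,Y,W]_d = \binom{d+2}{2}$ from Section 3 and checks that the codimension of $(F,G)_d$ equals $mn$ whenever $d \geq m+n$.

```python
from math import comb

def dim_homogeneous(d):
    """dim_K K[X, Y, W]_d: number of monomials of total degree d in three variables."""
    return comb(d + 2, 2) if d >= 0 else 0

def dim_ideal_part(m, n, d):
    """dim_K (F, G)_d as read off from the exact sequence; meaningful for d >= m + n."""
    return (dim_homogeneous(d - n) + dim_homogeneous(d - m)
            - dim_homogeneous(d - m - n))

for m in range(1, 6):
    for n in range(1, 6):
        for d in range(m + n, m + n + 6):
            # codimension of (F, G)_d in K[X, Y, W]_d is mn, independent of d
            assert dim_homogeneous(d) - dim_ideal_part(m, n, d) == m * n
```

For instance, with $m = 2$, $n = 3$, and $d = 5$ the three spaces have dimensions 6, 10, and 1, so $(F,G)_5$ has dimension 15 inside a space of dimension 21.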




The ideal $(F,G)$ is homogeneous, and thus we know from Section 3 that the image of $K[X,Y,W]_d$ in $K[X,Y,W]/(F,G)$ is $K[X,Y,W]_d/(F,G)_d$. If we write $\big(K[X,Y,W]/(F,G)\big)_d$ for this quotient, then $(\dagger)$ shows that

\[ \dim_K \big(K[X,Y,W]/(F,G)\big)_d = mn \qquad (\dagger\dagger) \]

for all $d \geq m+n$.

To prove $(*)$ and the theorem, we shall translate $(\dagger\dagger)$ into a conclusion about $K[X,Y]/(f,g)$. Fix $d \geq m+n$, and let $\{V_1 + (F,G), \ldots, V_{mn} + (F,G)\}$ be a $K$ basis of $\big(K[X,Y,W]/(F,G)\big)_d$. Define $v_j(X,Y) = V_j(X,Y,1)$ for each $j$. We shall prove that the vectors

\[ v_1 + (f,g), \ \ldots, \ v_{mn} + (f,g) \qquad (\ddagger) \]

form a $K$ basis of $K[X,Y]/(f,g)$.

We need to make use of the fact that $F$ and $G$ have no common zeros on the line at infinity. Since $W(F,G) \subseteq (F,G)$, the $K$ linear mapping of multiplication by $W$ on $K[X,Y,W]$ descends to a $K$ linear mapping $L$ of $K[X,Y,W]/(F,G)$ to itself defined by $L(H + (F,G)) = WH + (F,G)$. Let us see that

\[ L : K[X,Y,W]/(F,G) \to K[X,Y,W]/(F,G) \quad\text{is one-one}. \qquad (\ddagger\ddagger) \]

In fact, suppose that $WH = AF + BG$ for some $H$ in $K[X,Y,W]$. For any $U$ in $K[X,Y,W]$, let $U_0(X,Y) = U(X,Y,0)$. If $U$ is homogeneous, then so is $U_0$. In this notation we can write $F = F_0 + WM$ and $G = G_0 + WN$ for homogeneous members $M$ and $N$ of $K[X,Y,W]$. The polynomials $F_0$ and $G_0$ are relatively prime: in fact, if $F_0$ and $G_0$ have a nontrivial common factor $D_0$, then we can regard $D_0$ as a projective plane curve, and it must have a common zero $Q$ with $W$, by Theorem 8.5; but then $F$, $G$, and $W$ have $Q$ as a common zero, in contradiction to the normalization in the first paragraph of the proof. Since $WH = AF + BG$ implies $A_0F_0 = -B_0G_0$, it follows that $F_0$ divides $B_0$ and that $G_0$ divides $A_0$. In other words, $B_0 = C_0F_0$ and $A_0 = -C_0G_0$ for some $C_0$ in $K[X,Y]$. If we define $A' = A + C_0G$ and $B' = B - C_0F$, then the formulas for $A_0$ and $B_0$ show that $A'_0 = B'_0 = 0$. Hence $A' = WA''$ and $B' = WB''$ for some homogeneous polynomials $A''$ and $B''$. Then $WH = AF + BG = (A' - C_0G)F + (B' + C_0F)G = A'F + B'G = W(A''F + B''G)$, and we obtain $H = A''F + B''G$. Thus $H$ lies in $(F,G)$, and $(\ddagger\ddagger)$ is proved.

Left multiplication $L$ by $W$ carries $K[X,Y,W]_d$ into $K[X,Y,W]_{d+1}$ and carries $(F,G)_d$ into $(F,G)_{d+1}$. Therefore $L$ is well defined as a mapping from $\big(K[X,Y,W]/(F,G)\big)_d$ into $\big(K[X,Y,W]/(F,G)\big)_{d+1}$. Since it is one-one by $(\ddagger\ddagger)$ and since the two spaces are finite-dimensional of the same dimension $mn$ by $(\dagger\dagger)$, it is onto. Therefore

\[ \{W^rV_1 + (F,G), \ \ldots, \ W^rV_{mn} + (F,G)\} \quad\text{is a basis} \qquad (\S) \]

of $\big(K[X,Y,W]/(F,G)\big)_{d+r}$ for every $r \geq 0$.

To prove that $(\ddagger)$ spans $K[X,Y]/(f,g)$, let $h$ be in $K[X,Y]$. Let $H$ be a homogeneous polynomial in $K[X,Y,W]$ with $h(X,Y) = H(X,Y,1)$, and choose an integer $s$ such that $W^sH$ lies in $K[X,Y,W]_{d+r}$ for some $r \geq 0$. Then we can write $W^sH = \sum_{j=1}^{mn} c_jW^rV_j + AF + BG$ for suitable scalars $c_j$ and homogeneous polynomials $A$ and $B$. Restricting the domain to points $(X,Y,1)$ gives $h = \sum_{j=1}^{mn} c_jv_j + af + bg$, and therefore $h + (f,g) = \sum_{j=1}^{mn} c_jv_j + (f,g)$. This proves that $(\ddagger)$ spans $K[X,Y]/(f,g)$.

To prove that $(\ddagger)$ is linearly independent, suppose that $\sum_{j=1}^{mn} c_jv_j = af + bg$ with $a$ and $b$ in $K[X,Y]$. If $A$ and $B$ are homogeneous polynomials such that $a(X,Y) = A(X,Y,1)$ and $b(X,Y) = B(X,Y,1)$, then $W^r\sum_{j=1}^{mn} c_jV_j = W^sAF + W^tBG$, provided the exponents $r, s, t$ are chosen to make the degrees of the terms $W^r\sum_{j=1}^{mn} c_jV_j$, $W^sAF$, and $W^tBG$ match. Consequently $W^r\sum_{j=1}^{mn} c_jV_j$ lies in $(F,G)_{d+r}$, and $(\S)$ shows that the coefficients are all 0. This proves that $(\ddagger)$ is linearly independent.


7. Gröbner Bases


The remainder of the chapter returns to the main question introduced in Section 1, that of how to get information about the set of simultaneous solutions of polynomial equations in several variables. The resultant introduced in Section 2 gave us one tool, but the tool is of most use when there are only two equations. Beyond two equations the number of cases to check quickly grows, and the resultant is of limited usefulness.¹²

The tool to be introduced in this section is of a completely different nature. Historically it was introduced in order to have a way of deciding whether an ideal in $K[X_1, \ldots, X_n]$ contains a given polynomial. We know from the Hilbert Basis Theorem that every such ideal is finitely generated, and it is assumed that the ideal to be tested is specified by such a set of generators.

The proof of the Hilbert Basis Theorem gives a clue how to start studying an ideal of polynomials. In the statement of the theorem, $R$ is a Noetherian integral domain, and $I$ is a nonzero ideal in $R[X]$. It is to be proved that $I$ is finitely generated. The proof by Hilbert is longer than the proof given in Basic Algebra, but the idea is clearer. To each nonzero member $f(X)$ of $I$, we associate the coefficient of the highest power of $X$ appearing in $f(X)$. These coefficients, together with 0, form an ideal $L(I)$ in $R$, and $L(I)$ is finitely generated because $R$ is Noetherian. Let $a_1, \ldots, a_r$ be generators, let $f_1(X), \ldots, f_r(X)$ be members of $I$ with respective highest coefficients $a_1, \ldots, a_r$, and let $q$ be the largest of the degrees of $f_1(X), \ldots, f_r(X)$. If a general $g(X)$ in $I$ of degree $\geq q$ is given and if $a \in R$ is its highest coefficient, then we know that $a = \sum_j c_ja_j$ with $c_j \in R$. The polynomial $h(X)$ given by $h(X) = g(X) - \sum_j c_jf_j(X)X^{\deg g - \deg f_j}$ has degree lower than $\deg g$, and $g(X)$ will be in $(f_1, \ldots, f_r)$ if $h(X)$ is in $(f_1, \ldots, f_r)$. Iterating this construction, we see that it is enough to account for all the members of $I$ of degree $\leq q-1$. To handle these, one way to proceed is to enlarge the set $\{f_1, \ldots, f_r\}$ a little. For each $k$ with $0 \leq k \leq q-1$, let $L_k(I)$ be the union of $\{0\}$ and the set of coefficients of $X^k$ in members of $I$ of degree $k$. Each of these is an ideal of $R$ and hence is finitely generated, and for each $k$ with $0 \leq k \leq q-1$ we adjoin to $\{f_1, \ldots, f_r\}$ finitely many members of $I$ of degree $k$ whose coefficients of $X^k$ generate $L_k(I)$. The result is a finite set $\{g_1, \ldots, g_s\}$ of generators of $I$, as one easily checks.

¹²The nature of the extended theory can be found in Van der Waerden, Volume II, Chapter XI. Theorem 8.31 below in effect reproduces some of this extended theory in a context that is manageable because of the theory of Gröbner bases.

In fact, the set $\{g_1, \ldots, g_s\}$ is a special set of generators. For any member $f$ of $R[X]$, let $\mathrm{LT}(f)$ be the complete term of $f(X)$ containing the highest power of $X$. What the argument shows is that $\{g_1, \ldots, g_s\}$ is a subset of $I$ such that $\mathrm{LT}(I) = (\mathrm{LT}(g_1), \ldots, \mathrm{LT}(g_s))$, where $\mathrm{LT}(I)$ denotes the ideal given as the linear span of all polynomials $\mathrm{LT}(g)$ for $g$ in $I$. One can show that this property of $\{g_1, \ldots, g_s\}$ implies that $\{g_1, \ldots, g_s\}$ generates $I$. In essence this property will be the defining property of a “Gröbner basis” of $I$. It is not automatically satisfied for just any finite generating set $\{f_1, \ldots, f_r\}$, as the example below shows. We shall see that it is easy to use such a set of generators to test any polynomial in $R[X]$ for membership in $I$. Thus the original problem historically for introducing such sets is solved except for one little detail: the proof of the Hilbert Basis Theorem is not constructive, and we are left with no idea how actually to construct a Gröbner basis.¹³


EXAMPLE. Treat $K[X,Y]$ as an instance of the above setting by letting $R = K[Y]$ and regarding $K[X,Y]$ as $R[X]$. Consider the ideal $I = (f_1, f_2)$ in $R[X]$ with $f_1(X,Y) = X^2 + 2XY^2$ and $f_2(X,Y) = XY + 2Y^3 - 1$. Then $(\mathrm{LT}(f_1), \mathrm{LT}(f_2)) = (X^2, XY)$, and every monomial appearing with nonzero coefficient in a member of the latter ideal has total degree at least 2. On the other hand, $I$ contains the polynomial

\[ Yf_1(X,Y) - Xf_2(X,Y) = Y(X^2 + 2XY^2) - X(XY + 2Y^3 - 1) = X, \]

and its leading term is $X$, whose total degree is 1. Thus $\mathrm{LT}(I)$ properly contains $(\mathrm{LT}(f_1), \mathrm{LT}(f_2))$.
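The cancellation in this example is easy to reproduce by machine. The following Python sketch stores a polynomial in $X$ and $Y$ as a dictionary from exponent pairs $(i, j)$ to coefficients; this representation and the helper names `add` and `mul` are choices made here for illustration, not notation from the text.

```python
from collections import defaultdict

def add(p, q, scale=1):
    """Return p + scale*q; a polynomial is {(i, j): coeff}, meaning coeff * X^i * Y^j."""
    out = defaultdict(int)
    for mono, c in p.items():
        out[mono] += c
    for mono, c in q.items():
        out[mono] += scale * c
    return {mono: c for mono, c in out.items() if c != 0}

def mul(p, q):
    """Return the product p*q, adding exponents and multiplying coefficients."""
    out = defaultdict(int)
    for (a, b), c in p.items():
        for (a2, b2), c2 in q.items():
            out[(a + a2, b + b2)] += c * c2
    return {mono: c for mono, c in out.items() if c != 0}

X = {(1, 0): 1}
Y = {(0, 1): 1}
f1 = {(2, 0): 1, (1, 2): 2}               # X^2 + 2 X Y^2
f2 = {(1, 1): 1, (0, 3): 2, (0, 0): -1}   # X Y + 2 Y^3 - 1

combo = add(mul(Y, f1), mul(X, f2), scale=-1)   # Y*f1 - X*f2
assert combo == X   # every term of degree >= 2 cancels, leaving X
```

The point of the computation is that the leading terms $X^2Y$ of $Yf_1$ and $Xf_2$ cancel, so the leading term of the combination is not visible from $\mathrm{LT}(f_1)$ and $\mathrm{LT}(f_2)$ alone.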


Because of the nonconstructive nature of the proof of the Hilbert Basis Theorem, it is necessary to start afresh. One message to glean from the abstract proof is that the leading terms of the members of $I$ are important and somewhat control the nature of $I$. To handle $K[X_1, \ldots, X_n]$ when $K$ is a field, it is of course necessary to use an additional induction that enumerates the variables. In the example above, we treated $X$ as more significant than $Y$. For the inductive step for general $K[X_1, \ldots, X_n]$, the ring $R$ in the above argument is $K$ with some number $m$ of the indeterminates included, and $X$ is the $(m+1)^{\mathrm{st}}$ indeterminate. Putting all the steps of the induction together, we see that the order in which the variables are processed appears to be important.

¹³The exposition in this section and the next three is based partly on the book of Cox–Little–O’Shea and a now-defunct Web tutorial of Fabrizio Caruso.

The theory of Gröbner bases as it has evolved allows a healthy extra measure of generality. Instead of defining leading terms by insisting on an ordering of the indeterminates, it defines them by using a suitable kind of ordering of monomials, and that is where we begin. Let $K[X_1, \ldots, X_n]$ be given, $K$ being a field. Let $\mathcal{M}$ be the set of all monomials in $K[X_1, \ldots, X_n]$. A monomial ordering $\leq$ on $\mathcal{M}$ is a total ordering¹⁴ with the two additional properties that

(i) $M_1 \leq M_2$ implies $M_1M_3 \leq M_2M_3$ for all $M_1, M_2, M_3$ in $\mathcal{M}$,

(ii) $1 \leq M$ for all $M$ in $\mathcal{M}$.

We write $M_2 \geq M_1$ to mean $M_1 \leq M_2$. Also, $M_1 < M_2$ means $M_1 \leq M_2$ with $M_1 \neq M_2$, and $M_1 > M_2$ means $M_1 \geq M_2$ with $M_1 \neq M_2$.


EXAMPLES OF MONOMIAL ORDERINGS. Each ordering assumes that the variables are enumerated in some way. In these examples we take this enumeration to be $X_1, \ldots, X_n$. The first four examples all have the property that the largest $X_j$ is $X_1$ and the smallest is $X_n$.


(1) Lexicographic ordering, abbreviated as “lex” by many authors and written as $\leq_{\mathrm{LEX}}$ in this list of examples. This, the most important monomial ordering, is already suggested by the proof of the Hilbert Basis Theorem. In principle it can be used for all purposes in Sections 7–10, but one application in Chapter X will require a different monomial ordering. Its disadvantage is that it sometimes makes lengthy computations take longer than necessary; this matter will be discussed more in Section 9. The definition is that $X_1^{i_1} \cdots X_n^{i_n} \leq_{\mathrm{LEX}} X_1^{j_1} \cdots X_n^{j_n}$ if either the two monomials are equal or else the first $k$ for which $i_k \neq j_k$ has $i_k < j_k$. Thus for example, $X_1X_2^2X_3^3 \leq_{\mathrm{LEX}} X_1^2$. The word “lexicographic” refers to the dictionary system for alphabetizing in which a first word comes before a second word if for the first position in which the two words differ, the letter of the first word in that position precedes alphabetically the letter of the second word in that position.


(2) Graded lexicographic ordering, abbreviated as “glex” or “grlex” by many authors. As in Section 3 the total degree of a monomial $X_1^{i_1} \cdots X_n^{i_n}$ is $\deg(X_1^{i_1} \cdots X_n^{i_n}) = \sum_{k=1}^{n} i_k$. The definition of the ordering is that $M \leq_{\mathrm{GLEX}} N$ if either $\deg M < \deg N$ or else if $\deg M = \deg N$ and $M \leq_{\mathrm{LEX}} N$. Thus for example, $X_1^2 \leq_{\mathrm{GLEX}} X_1X_2^2X_3^3$ because the total degree 2 of the first monomial is less than the total degree 6 of the second monomial. But $X_1X_2^2X_3^3 \leq_{\mathrm{GLEX}} X_1^2X_3^4$ because both monomials have the same total degree 6 and the second monomial involves a higher power of $X_1$ than does the first. This monomial ordering is not much used; more common is the variant of it in the next example.

¹⁴This means a partial ordering with the properties that each pair $a, b$ has $a \leq b$ or $b \leq a$ and that both hold only if $a = b$.

(3) Graded reverse lexicographic ordering, abbreviated as “grevlex” by many authors. The definition is that $M \leq_{\mathrm{GREVLEX}} N$ if either $\deg M < \deg N$ or else if $\deg M = \deg N$ and $N' \leq_{\mathrm{LEX}} M'$, where $M'$ is $M$ but with the exponents of $X_j$ and $X_{n+1-j}$ interchanged for each $j$, and where $N'$ is defined similarly. This ordering takes some getting used to. For example, $X_1^2X_3^4 \leq_{\mathrm{GREVLEX}} X_1X_2^2X_3^3$ when $n = 3$ because both monomials have the same total degree and $X_1^3X_2^2X_3 = (X_1X_2^2X_3^3)' \leq_{\mathrm{LEX}} (X_1^2X_3^4)' = X_1^4X_3^2$. By contrast, $X_1X_2^2X_3^3 \leq_{\mathrm{GLEX}} X_1^2X_3^4$.

(4) Orderings of $k$-elimination type, where $1 \leq k \leq n-1$. These are orderings such that any monomial containing one of $X_1, \ldots, X_k$ to a positive power exceeds any monomial in $X_{k+1}, \ldots, X_n$ alone. These will be discussed in Section 10. Of them, one of particular importance is the Bayer–Stillman ordering of $k$-elimination type. Here a monomial $M$ is $\leq$ a monomial $N$ if the sum of the exponents of $X_1, \ldots, X_k$ for $M$ is less than the corresponding sum for $N$ or else the two sums are equal and $M \leq_{\mathrm{GREVLEX}} N$. This ordering is commonly used for making computations in the context of Section 10.


(5) Ordering from a tuple of weight vectors. For $1 \leq i \leq n$, let $w^{(i)}$ be a vector in $\mathbb{R}^n$ of the form $w^{(i)} = (w^{(i)}_1, \ldots, w^{(i)}_n)$, and assume that $w^{(1)}, \ldots, w^{(n)}$ are linearly independent over $\mathbb{R}$. Identify the monomial $X^\alpha$ with the vector of individual exponents $\alpha = (\alpha_1, \ldots, \alpha_n)$. The ordering given by the weight vectors $w^{(i)}$ is defined by saying that $X^\alpha \leq X^\beta$ if $X^\alpha = X^\beta$ or if the first $i$ such that $w^{(i)} \cdot \alpha \neq w^{(i)} \cdot \beta$ has $w^{(i)} \cdot \alpha < w^{(i)} \cdot \beta$. Here the dot refers to the ordinary dot product. A condition is needed on the $w$’s to ensure that $1 \leq X^\alpha$ for all $\alpha$. (See Problem 14 at the end of the chapter.) Here are two specific examples for which the condition is satisfied. Let $e^{(i)}$ be the $i^{\mathrm{th}}$ standard basis vector of $\mathbb{R}^n$. The lexicographic ordering in Example 1 is determined by the tuple of weight vectors $(e^{(1)}, \ldots, e^{(n)})$. The Bayer–Stillman ordering in Example 4 is determined by the tuple of weight vectors
(cD 4... +e, 8D 4.2.4 0,-2.2), 2, 0), 


Further discussion of monomial orderings determined by weight vectors occurs in Problems 14–15 at the end of the chapter.
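The definitions of the lexicographic, graded lexicographic, and graded reverse lexicographic orderings, and of an ordering from weight vectors, translate directly into sort keys on exponent vectors. The Python sketch below is one possible encoding; the function names and the three sample monomials $X_1X_2^2X_3^3$, $X_1^2$, and $X_1^2X_3^4$ are chosen here for illustration. A monomial $M$ is $\leq N$ exactly when `key(M) <= key(N)`.

```python
def lex_key(exps):
    """Lex: compare exponents of X_1, X_2, ... in turn."""
    return tuple(exps)

def grlex_key(exps):
    """Graded lex: total degree first, then lex."""
    return (sum(exps), tuple(exps))

def grevlex_key(exps):
    """Graded reverse lex: total degree first; in equal degree M <= N iff
    N' <=_LEX M', where ' reverses the exponents. Negating the reversed
    exponents turns that reversed comparison into an ordinary one."""
    return (sum(exps), tuple(-e for e in reversed(exps)))

def weight_key(weights):
    """Ordering from a tuple of weight vectors: compare the dot products in turn."""
    def key(exps):
        return tuple(sum(w_i * e for w_i, e in zip(w, exps)) for w in weights)
    return key

a = (1, 2, 3)   # X1 * X2^2 * X3^3
b = (2, 0, 0)   # X1^2
c = (2, 0, 4)   # X1^2 * X3^4

assert lex_key(a) <= lex_key(b)          # lex looks at the power of X1 first
assert grlex_key(b) <= grlex_key(a)      # degree 2 < degree 6
assert grlex_key(a) <= grlex_key(c)      # equal degree, then lex
assert grevlex_key(c) <= grevlex_key(a)  # equal degree, reversed comparison

# The standard basis vectors as weights reproduce lex.
standard = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
assert weight_key(standard)(a) == lex_key(a)
```

Representing an ordering by a key function rather than a comparison function is a design choice of this sketch; it lets `max` and `sorted` pick out leading monomials directly.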


Property (i) of monomial orderings insists that the ordering respect multiplication of monomials in the natural way. Property (ii), according to the next proposition, is a well-ordering property. The proof of the proposition will be preceded by a lemma.


Proposition 8.16. In any monomial ordering for $K[X_1, \ldots, X_n]$, any decreasing sequence $M_1 \geq M_2 \geq M_3 \geq \cdots$ is eventually constant. Consequently each nonempty subset of $\mathcal{M}$ has a smallest element in the ordering.


Lemma 8.17. If $I$ is an ideal in $K[X_1, \ldots, X_n]$ generated by monomials and if $f(X_1, \ldots, X_n)$ is in $I$, then each monomial appearing in the expansion of $f$ with nonzero coefficient lies in $I$. Consequently $I$ has a finite set of monomials as generators. Moreover, if $\{M_1, \ldots, M_s\}$ is a set of monomials that generate $I$ and if $M$ is any monomial in $I$, then some $M_j$ divides $M$.


PROOF. Let $\{M_\alpha\}$ be the set of monomials that generates $I$. If $f$ is in $I$, then we can write $f = \sum_j h_jM_{\alpha_j}$ for polynomials $h_j$. Let $h_j = \sum_i c_{ij}M_{ij}$ be the expansion of $h_j$ in terms of monomials. If $M_0$ is a monomial appearing in $f$ with nonzero coefficient $c$, then the only possible monomial $M_{ij}$ in $h_j$ that can contribute toward $c$ is one with $M_{ij}M_{\alpha_j} = M_0$ if such a monomial exists. For some $j$, such a monomial must exist, or $c$ would be 0; thus $M_0$ lies in $I$.

For the second conclusion, write $I = (f_1, \ldots, f_l)$ by the Hilbert Basis Theorem. The first conclusion shows that each monomial contributing to each $f_j$ lies in $I$, and the set of all these monomials, as $j$ varies, is therefore a finite set of monomials generating $I$.

For the third conclusion, write $M = \sum_{j=1}^{s} a_jM_j$ for polynomials $a_j$. Expanding each $a_j$ in terms of monomials, we see that some $a_j$ contains with nonzero coefficient a monomial $M'$ such that $M = M'M_j$. The divisibility follows.
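Lemma 8.17 reduces membership in a monomial ideal to divisibility checks on exponent vectors. A minimal Python sketch follows; monomials are written as exponent tuples and polynomials as dictionaries from exponent tuples to coefficients, and the function names are illustrative rather than taken from the text.

```python
def divides(m, n):
    """True if the monomial with exponents m divides the monomial with exponents n."""
    return all(a <= b for a, b in zip(m, n))

def monomial_in_ideal(mono, generators):
    """Third conclusion of the lemma: a monomial lies in the ideal exactly when
    some monomial generator divides it."""
    return any(divides(g, mono) for g in generators)

def polynomial_in_monomial_ideal(poly, generators):
    """First conclusion of the lemma: a polynomial lies in a monomial ideal exactly
    when every monomial occurring in it with nonzero coefficient does."""
    return all(monomial_in_ideal(m, generators) for m, c in poly.items() if c != 0)

# The ideal (X^2, XY) in K[X, Y], with exponents listed as (power of X, power of Y).
gens = [(2, 0), (1, 1)]

assert not monomial_in_ideal((1, 0), gens)                        # X is not in (X^2, XY)
assert monomial_in_ideal((3, 1), gens)                            # X^3 Y is
assert polynomial_in_monomial_ideal({(2, 1): 5, (1, 2): -1}, gens)
assert not polynomial_in_monomial_ideal({(2, 1): 5, (0, 3): 2}, gens)
```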


PROOF OF PROPOSITION 8.16. Let $M$ be a monomial, and let $I$ be the linear span of all monomials $M'$ with $M' \geq M$. If $M'$ is such a monomial and $N$ is any monomial, then $NM' \geq NM$ by (i), and $NM \geq 1M = M$ by (i) and (ii). Therefore $NM'$ lies in $I$, and $I$ is an ideal.

From such an ideal $I$, we can recover $M$ as the unique monomial $M_0$ in $I$ such that $M_0 \leq M'$ for every monomial $M'$ in $I$, since any such $M_0$ has $M_0 \leq M$ as well as $M \leq M_0$.

With $M_1, M_2, \ldots$ given as in the proposition, let $I_k$ be the linear span of all monomials $M' \geq M_k$. We have just seen that $I_k$ is an ideal, and the $I_k$’s are increasing in $k$. Then $I = \bigcup_{k=1}^{\infty} I_k$ is an ideal generated by monomials, and Lemma 8.17 shows that it has a finite set of monomials as a set of generators. Each such monomial generator lies in some $I_k$. Since the $I_k$’s are nested, all the generators lie in some $I_{k_0}$, and we conclude that $I = I_{k_0}$, hence that $I_k = I_{k_0}$ for all $k \geq k_0$. The previous paragraph of the proof shows that $I_k$ determines $M_k$, and therefore $M_k = M_{k_0}$ for all $k \geq k_0$.




For the last statement of the proposition, if there were no least element, then for 
any element in the subset, we could always find a smaller element in the subset. 
In this way, we would be able to construct a strictly decreasing infinite sequence 
in M, in contradiction to what has just been proved. 


Fix a monomial ordering for $K[X_1, \ldots, X_n]$. If $f$ is any nonzero member of $K[X_1, \ldots, X_n]$ and if $f$ is expanded as a $K$ linear combination of monomials, then we define the leading monomial, leading coefficient, and leading term of $f$ by

\[
\begin{aligned}
\mathrm{LM}(f) &= \text{largest monomial with nonzero coefficient in expansion of } f, \\
\mathrm{LC}(f) &= \text{coefficient of } \mathrm{LM}(f) \text{ in } f, \\
\mathrm{LT}(f) &= \mathrm{LC}(f)\,\mathrm{LM}(f).
\end{aligned}
\]

It will be convenient to be able to use these definitions without having to distinguish the cases $f \neq 0$ and $f = 0$. Accordingly, let us adjoin 0 to the set $\mathcal{M}$, agreeing that $0 \leq M$ and $0M = 0$ for every monomial $M$. We adopt the convention that $\mathrm{LM}(0) = 0$, $\mathrm{LT}(0) = 0$, and $\mathrm{LC}(0) = 0$.
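The three definitions can be made concrete with a few lines of Python. In this sketch a polynomial is a dictionary from exponent tuples to coefficients, and a monomial ordering is supplied as a key function on exponent tuples; both are representation choices made here. The zero polynomial is the empty dictionary, and the convention $\mathrm{LM}(0) = 0$ is modeled by returning `None`, since an exponent tuple cannot represent the adjoined element 0.

```python
def leading_monomial(poly, key):
    """Exponent tuple of LM(poly) for the ordering given by `key`; None if poly is 0."""
    support = [m for m, c in poly.items() if c != 0]
    return max(support, key=key) if support else None

def leading_coefficient(poly, key):
    """LC(poly), with LC(0) = 0."""
    m = leading_monomial(poly, key)
    return poly[m] if m is not None else 0

def leading_term(poly, key):
    """LT(poly) = LC(poly) * LM(poly) as a one-term polynomial, with LT(0) = 0."""
    m = leading_monomial(poly, key)
    return {m: poly[m]} if m is not None else {}

lex = lambda e: tuple(e)               # lexicographic, first variable largest
grlex = lambda e: (sum(e), tuple(e))   # graded lexicographic

# f = X Y + 2 Y^3 - 1, exponents listed as (power of X, power of Y)
f = {(1, 1): 1, (0, 3): 2, (0, 0): -1}

assert leading_monomial(f, lex) == (1, 1)     # X Y leads under lex
assert leading_term(f, grlex) == {(0, 3): 2}  # 2 Y^3 leads under graded lex
assert leading_coefficient({}, lex) == 0
```

The same polynomial has different leading terms under different orderings, which is why the ordering is fixed once and for all before these notions are used.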

Since any monomial that occurs in a sum of two polynomials occurs in one or the other of them, it is immediate from the definition that

\[ \mathrm{LM}(f_1 + f_2) \leq \max(\mathrm{LM}(f_1), \mathrm{LM}(f_2)) \]

if $f_1$, $f_2$, and $f_1 + f_2$ are nonzero. Checking the various cases, we see that this inequality persists if one or more of $f_1$, $f_2$, and $f_1 + f_2$ are 0.

The comparable results concerning multiplication are contained in the next 
proposition. 


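Proposition 8.18. If $f_1$ and $f_2$ are two nonzero members of $K[X_1, \ldots, X_n]$, then

\[ \mathrm{LM}(f_1f_2) = \mathrm{LM}(f_1)\,\mathrm{LM}(f_2) \quad\text{and}\quad \mathrm{LC}(f_1f_2) = \mathrm{LC}(f_1)\,\mathrm{LC}(f_2); \]

hence

\[ \mathrm{LT}(f_1f_2) = \mathrm{LT}(f_1)\,\mathrm{LT}(f_2). \]

These equalities persist if one or both of $f_1$ and $f_2$ are 0. Moreover, if $f_1$ and $f_2$ are nonzero and have $\mathrm{LT}(f_1) = \mathrm{LT}(f_2)$, then $\mathrm{LM}(f_1 - f_2) < \mathrm{LM}(f_1)$.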


PROOF. For the first statement, let the expansions of $f_1$ and $f_2$ as linear combinations of distinct monomials be $f_1 = a_1\,\mathrm{LM}(f_1) + \sum_i c_iM_i$ and $f_2 = a_2\,\mathrm{LM}(f_2) + \sum_j d_jN_j$ with $M_i < \mathrm{LM}(f_1)$ for all $i$ and $N_j < \mathrm{LM}(f_2)$ for all $j$. Then $f_1f_2$ equals

\[ a_1a_2\,\mathrm{LM}(f_1)\,\mathrm{LM}(f_2) + a_2\sum_i c_iM_i\,\mathrm{LM}(f_2) + a_1\sum_j d_j\,\mathrm{LM}(f_1)N_j + \sum_{i,j} c_id_jM_iN_j, \]

and the conclusions in the first sentence of the proposition will follow if it is shown that $M_i\,\mathrm{LM}(f_2) < \mathrm{LM}(f_1)\,\mathrm{LM}(f_2)$, that $\mathrm{LM}(f_1)N_j < \mathrm{LM}(f_1)\,\mathrm{LM}(f_2)$, and that $M_iN_j < \mathrm{LM}(f_1)\,\mathrm{LM}(f_2)$. The first inequality follows from (i) because $M_i < \mathrm{LM}(f_1)$, and the second inequality is similar. For the third we apply (i) twice to obtain $M_iN_j \leq M_i\,\mathrm{LM}(f_2) \leq \mathrm{LM}(f_1)\,\mathrm{LM}(f_2)$ and observe that the end expressions can be equal only if equality holds in both instances. The latter is impossible because $K[X_1, \ldots, X_n]$ is an integral domain, and thus $M_iN_j < \mathrm{LM}(f_1)\,\mathrm{LM}(f_2)$.

The three displayed equalities persist if one or both of $f_1$ and $f_2$ are 0 because $\mathrm{LM}(f)$, $\mathrm{LT}(f)$, and $\mathrm{LC}(f)$ can be 0 only if $f = 0$.

Finally if $f_1$ and $f_2$ are nonzero and have expansions as in the first paragraph of the proof with $\mathrm{LT}(f_1) = \mathrm{LT}(f_2)$, then $\mathrm{LC}(f_1) = a_1$ and $\mathrm{LC}(f_2) = a_2$. Hence $f_1 - f_2$ has an expansion involving only the monomials $M_i$ and $N_j$. Consequently if $f_1 - f_2 \neq 0$, then the largest of the $M_i$’s and $N_j$’s is $< \mathrm{LM}(f_1)$. Thus $\mathrm{LM}(f_1 - f_2) < \mathrm{LM}(f_1)$. This inequality holds also if $f_1 - f_2 = 0$.
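The multiplicativity of leading monomials and leading coefficients can be observed on a small instance. The Python sketch below multiplies $X^2 + 2XY^2$ by $XY + 2Y^3 - 1$ and checks the two equalities of Proposition 8.18 under two orderings; the dictionary representation of polynomials and the helper names `mul` and `lm` are assumptions of the sketch.

```python
from collections import defaultdict

def mul(p, q):
    """Product of polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[tuple(a + b for a, b in zip(m1, m2))] += c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def lm(p, key):
    """Leading monomial of a nonzero polynomial for the ordering given by `key`."""
    return max(p, key=key)

orderings = {
    "lex": lambda e: tuple(e),
    "grlex": lambda e: (sum(e), tuple(e)),
}

f1 = {(2, 0): 1, (1, 2): 2}               # X^2 + 2 X Y^2
f2 = {(1, 1): 1, (0, 3): 2, (0, 0): -1}   # X Y + 2 Y^3 - 1
product = mul(f1, f2)

for name, key in orderings.items():
    m1, m2, m12 = lm(f1, key), lm(f2, key), lm(product, key)
    # LM(f1 f2) = LM(f1) LM(f2): exponents add
    assert m12 == tuple(a + b for a, b in zip(m1, m2))
    # LC(f1 f2) = LC(f1) LC(f2)
    assert product[m12] == f1[m1] * f2[m2]
```

Under lex the leading monomials are $X^2$ and $XY$ with product $X^3Y$; under graded lex they are $XY^2$ and $Y^3$ with product $XY^5$, whose coefficient in the product is $2 \cdot 2 = 4$.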


If $I$ is a nonzero ideal in $K[X_1, \ldots, X_n]$, we define $\mathrm{LT}(I)$ to be the vector space of all $K$ linear combinations of polynomials $\mathrm{LT}(f)$ with $f$ in $I$. It follows from Proposition 8.18 that $K[X_1, \ldots, X_n]\,\mathrm{LT}(I) \subseteq \mathrm{LT}(I)$, and therefore $\mathrm{LT}(I)$ is an ideal in $K[X_1, \ldots, X_n]$. A finite unordered subset $\{g_1, \ldots, g_k\}$ of nonzero elements of the ideal $I$ is called a Gröbner basis of $I$ if $\mathrm{LT}(I) = (\mathrm{LT}(g_1), \ldots, \mathrm{LT}(g_k))$. The inclusion $\supseteq$ follows from the definition, and the question is whether $\mathrm{LT}(g_1), \ldots, \mathrm{LT}(g_k)$ generate $\mathrm{LT}(I)$.

Among the examples below, Example 3 is particularly suggestive of the utility of a Gröbner basis. The idea is that an ordinary set of generators may have the property that certain “small” elements of $I$ can be expanded in terms of the generators only using “large” coefficients and that this property is reflected in the failure of $(\mathrm{LT}(g_1), \ldots, \mathrm{LT}(g_k))$ to exhaust $\mathrm{LT}(I)$.


EXAMPLES WITH LEXICOGRAPHIC ORDERING. 


(1) Principal ideal. If $I = (f(X_1, \ldots, X_n))$, then $\{f\}$ is a Gröbner basis. In fact, the most general member of $I$ is of the form $hf$ with $h$ in $K[X_1, \ldots, X_n]$, and Proposition 8.18 gives $\mathrm{LT}(hf) = \mathrm{LT}(h)\,\mathrm{LT}(f)$. Therefore $\mathrm{LT}(I) = (\mathrm{LT}(f))$, as required.


(2) Ideal generated by members of $K[X_1, \ldots, X_n]_1$. Suppose that $I = (L_1, \ldots, L_k)$, where each $L_j$ is a homogeneous polynomial of degree 1. For example, $I$ could be $(X_1 + X_2 + X_3,\ X_1 - X_3)$. Let us form the corresponding $k$-by-$n$ coefficient matrix, specifically $\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & -1 \end{pmatrix}$ in the 3-variable example. If we perform row operations to transform this matrix into reduced row-echelon form and let $L'_1, \ldots, L'_k$ be the members of $K[X_1, \ldots, X_n]$ corresponding to the reduced matrix, specifically $X_1 - X_3$ and $X_2 + 2X_3$ for the reduced form $\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \end{pmatrix}$ of $\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & -1 \end{pmatrix}$, then $I = (L'_1, \ldots, L'_k)$ and moreover $\{L'_1, \ldots, L'_k\}$ is a Gröbner basis of $I$. This fact is not particularly obvious in the full generality of this example, but it will be shown to be an easy consequence of Theorem 8.23 in the next section.
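The row reduction in this example can be carried out with exact rational arithmetic. The following Python sketch is a plain Gauss-Jordan elimination using `fractions.Fraction`; the function name `rref` is chosen here, and the routine is a generic textbook procedure rather than anything specific to the text.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row-echelon form of a matrix given as a list of rows, over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        # find a row at or below pivot_row with a nonzero entry in this column
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        pivot = m[pivot_row][col]
        m[pivot_row] = [x / pivot for x in m[pivot_row]]   # scale the pivot to 1
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return m

# Rows are the coefficients of X1 + X2 + X3 and X1 - X3.
reduced = rref([[1, 1, 1], [1, 0, -1]])
assert reduced == [[1, 0, -1], [0, 1, 2]]   # X1 - X3 and X2 + 2 X3
```

Each row of the reduced matrix has its leading 1 in a column where every other row has 0, so the leading monomials of the corresponding linear forms under the lexicographic ordering are distinct variables.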


(3) Earlier example in this section. In $K[X,Y]$, let $I = (f_1, f_2)$ with $f_1(X,Y) = X^2 + 2XY^2$ and $f_2(X,Y) = XY + 2Y^3 - 1$. Then $(\mathrm{LT}(f_1), \mathrm{LT}(f_2)) = (X^2, XY)$. We saw that $X$ is a member of $I$ and that $\mathrm{LT}(X) = X$ is not in $(\mathrm{LT}(f_1), \mathrm{LT}(f_2))$. So $\{f_1, f_2\}$ is not a Gröbner basis. If we enlarge the set of generators of $I$ to $\{f_1, f_2, X\}$, then we still do not have a Gröbner basis because $f_2 - YX = 2Y^3 - 1$ is in $I$ and $\mathrm{LT}(f_2 - YX) = 2Y^3$ does not lie in $(\mathrm{LT}(f_1), \mathrm{LT}(f_2), \mathrm{LT}(X)) = (X^2, XY, X) = (X)$. We can enlarge the set of generators still further to $\{f_1, f_2, X, 2Y^3 - 1\}$. Is this a Gröbner basis? Here we have $(\mathrm{LT}(f_1), \mathrm{LT}(f_2), \mathrm{LT}(X), \mathrm{LT}(2Y^3 - 1)) = (X, Y^3)$, and it seems as if this equals $\mathrm{LT}(I)$. But we need a way of checking easily. We shall obtain a way of checking in Theorem 8.23 in the next section.


The question of existence and uniqueness of a Gröbner basis will be addressed constructively in Sections 8–9; however, we did observe at the beginning of this section that Hilbert’s proof of the Hilbert Basis Theorem essentially handles existence when the monomial ordering is the usual lexicographic ordering. Actually, the argument at the beginning of the section had two parts to it: a nonconstructive argument producing a certain finite set of leading terms and a verification that those leading terms lead to a set of generators of the ideal. The first part, being a nonconstructive existence proof, does not help us in our current efforts, and we defer to Problem 13 at the end of the chapter the question of adapting it to a general monomial order. The second part, on the other hand, is a useful kind of verification in our current efforts. It shows that a certain kind of finite subset of an ideal is necessarily a set of generators, and it generalizes as follows. The generalization will play a role in Section 9.


Proposition 8.19. If $K$ is a field, if a monomial ordering is specified for $K[X_1, \ldots, X_n]$, and if $\{g_1, \ldots, g_k\}$ is a Gröbner basis for a nonzero ideal $I$ of $K[X_1, \ldots, X_n]$, then $\{g_1, \ldots, g_k\}$ generates $I$.


PROOF. First we prove that if $f \neq 0$ is in $I$, then there exist a $g_j$, a monomial $M_0$, and a nonzero scalar $c$ such that $\mathrm{LM}(f - cM_0g_j) < \mathrm{LM}(f)$. To see this, we use the hypothesis that $\{g_1, \ldots, g_k\}$ is a Gröbner basis to find polynomials $h_1, \ldots, h_k$ such that $\mathrm{LM}(f) = \sum_{i=1}^{k} h_i\,\mathrm{LM}(g_i)$. Then it must be true for $i$ equal to some index $j$ that $\mathrm{LM}(f) = M_0\,\mathrm{LM}(g_j)$ for one of the monomials $M_0$ that appears in $h_j$ with nonzero coefficient. Since $M_0\,\mathrm{LM}(g_j) = \mathrm{LM}(M_0)\,\mathrm{LM}(g_j) = \mathrm{LM}(M_0g_j)$, we can rewrite this equality as $\mathrm{LT}(f) = c\,\mathrm{LT}(M_0g_j)$ for some scalar $c \neq 0$. Then $\mathrm{LT}(f) = \mathrm{LT}(cM_0g_j)$, and Proposition 8.18 shows that $\mathrm{LM}(f - cM_0g_j) < \mathrm{LM}(f)$, as asserted.

Iterating this construction and assuming that we never get 0, we can find successively nonzero scalars $c_j$, monomials $M_j$, and members $g_{i_j}$ of the Gröbner basis such that the sequence $\mathrm{LM}\big(f - \sum_{j=1}^{l} c_jM_jg_{i_j}\big)$ indexed by $l$ is strictly decreasing, in contradiction to Proposition 8.16. To avoid the contradiction, we must have $f - \sum_{j=1}^{l} c_jM_jg_{i_j} = 0$ for some $l$, and then $f$ is exhibited as in the ideal $(g_1, \ldots, g_k)$. Hence the Gröbner basis generates $I$.
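The iteration in this proof is effectively a procedure: as long as some $\mathrm{LM}(g_j)$ divides $\mathrm{LM}(f)$, subtract the multiple $cM_0g_j$ that cancels the leading term. The Python sketch below carries it out for the lexicographic ordering on $K[X,Y]$ with $K = \mathbb{Q}$. The dictionary representation, the function names, and the test input are assumptions made here; the basis used is the one-element set $\{g\}$ with $g = XY + 2Y^3 - 1$, which is a Gröbner basis of $(g)$ by the principal-ideal example of Section 7.

```python
from fractions import Fraction

def lex_lm(poly):
    """Leading monomial under lex; exponent tuples compare lexicographically."""
    return max(poly)

def reduce_by_leading_terms(f, basis):
    """Repeat f -> f - c*M0*g_j while some LM(g_j) divides LM(f).
    Returns (steps, remainder), each step being (c, M0, j).
    Basis members are assumed to be stored without zero coefficients."""
    f = {m: Fraction(c) for m, c in f.items() if c != 0}
    steps = []
    while f:
        lm_f = lex_lm(f)
        for j, g in enumerate(basis):
            lm_g = lex_lm(g)
            if all(a >= b for a, b in zip(lm_f, lm_g)):
                m0 = tuple(a - b for a, b in zip(lm_f, lm_g))
                c = f[lm_f] / Fraction(g[lm_g])
                for m, cg in g.items():        # subtract c * M0 * g
                    t = tuple(a + b for a, b in zip(m, m0))
                    f[t] = f.get(t, 0) - c * cg
                    if f[t] == 0:
                        del f[t]
                steps.append((c, m0, j))
                break
        else:
            break   # no LM(g_j) divides LM(f); the process stops with f nonzero
    return steps, f

g = {(1, 1): 1, (0, 3): 2, (0, 0): -1}   # X Y + 2 Y^3 - 1
# f = (X + Y) * g, written out term by term
f = {(2, 1): 1, (1, 3): 2, (1, 0): -1, (1, 2): 1, (0, 4): 2, (0, 1): -1}

steps, remainder = reduce_by_leading_terms(f, [g])
assert remainder == {}                                # f is exhibited as X*g + Y*g
assert steps == [(1, (1, 0), 0), (1, (0, 1), 0)]
```

When the input is not in the ideal, the loop stops at a nonzero polynomial whose leading monomial is divisible by no $\mathrm{LM}(g_j)$; for a Gröbner basis this certifies non-membership, for example with the input $X$ here. Termination in every case is the content of Proposition 8.16.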


8. Constructive Existence 


Throughout this section, $K$ denotes a field, and we work with a fixed monomial ordering on $K[X_1, \ldots, X_n]$. Ideals in $K[X_1, \ldots, X_n]$ will always be specified by giving finite sets of generators. Our objective is to obtain a constructive proof of the existence of a Gröbner basis for each nonzero ideal in $K[X_1, \ldots, X_n]$, along with a useful test procedure for deciding whether a given finite set of generators of $I$ is a Gröbner basis. As is often the case with existence proofs, the motivation for the proof comes from a certain amount of deduction of properties that a Gröbner basis must satisfy if it exists. It was mentioned in the previous section that the failure of a set of generators to be a Gröbner basis has something to do with its failure to be able to represent all “small” elements of the ideal by means of expansions in terms of the generators that use “small” coefficients. The first part of this section will explore this idea, seeking to make it precise. The main step will be a checkable test for a set to be a Gröbner basis; this is Theorem 8.23. The existence argument will be an easy corollary. A by-product of the existence argument will be a way of testing a polynomial for membership in $I$.

In the one-variable case any ideal is principal, necessarily of the form (g(X)), 
and the test for membership of a polynomial f in the ideal is to apply the division 
algorithm, writing f(X) = q(X)g(X) + r(X) with r = 0 or deg r < deg g.
Then f is a member of the ideal if and only if r = 0. The starting point for the
several-variable theory is to do the best we can to generalize the division algorithm 
to several variables, recognizing that we cannot expect too much because of the 
complicated ideal structure in several variables. 


Proposition 8.20 (generalized division algorithm). Let (f_1, ..., f_s) be a fixed
enumeration of a set of nonzero members of K[X_1, ..., X_n], and let f be an
arbitrary nonzero member of K[X_1, ..., X_n]. Then there exist polynomials
a_1, ..., a_s and r such that

    f = a_1 f_1 + ... + a_s f_s + r,

such that LM(a_j f_j) ≤ LM(f) for all j, and such that no monomial appearing in r
with nonzero coefficient is divisible by LM(f_j) for any j.


REMARK. The proof below will stop short of giving an algorithm, because
omitting the details of the algorithm will make the invariant of the construction
clearer. To make the proof into an algorithm, one merely needs to be systematic
about the choices in the proof. There is no claim of any uniqueness of a_1, ..., a_s
or r in the statement; in fact, Problem 16 at the end of the chapter shows that
more than one kind of nonuniqueness is possible. Corollary 8.21 below, however,
will show that if the given f_1, ..., f_s form a Gröbner basis of an ideal I, then
r is independent of the enumeration of the Gröbner basis, even without the
requirement that LM(a_j f_j) ≤ LM(f) for all j.


PROOF. We shall do a kind of induction involving decompositions of f of the
form
    f = (a_1 f_1 + ... + a_s f_s) + p + r,    (*)
where a_1, ..., a_s, p, r are polynomials with the properties that
    (i) LM(p) ≤ LM(f),
    (ii) LM(a_j f_j) ≤ LM(f) for all j,
    (iii) no monomial M appearing in r with nonzero coefficient has M divisible
          by any LM(f_i),
and we shall demonstrate that LM(p) decreases at every step of the induction as
long as p ≠ 0. Initially we take all a_j = 0, p = f, and r = 0. Then (*) and the
three properties hold at the start. Let us describe the inductive step.

If LT(f_j) divides LT(p) for some j, then we replace a_j by a_j + LT(p)/LT(f_j),
we change p to p − (LT(p)/LT(f_j)) f_j, and we leave r alone. The equality (*)
is maintained, and (iii) continues to hold. Since

    LT((LT(p)/LT(f_j)) f_j) = LT(LT(p)/LT(f_j)) LT(f_j)
                            = (LT(p)/LT(f_j)) LT(f_j) = LT(p),    (**)

Proposition 8.18 shows that LM(p) strictly decreases. Consequently (i) continues
to hold. By the same kind of computation as for (**),

    LM((a_j + LT(p)/LT(f_j)) f_j) ≤ max(LM(a_j f_j), LM((LT(p)/LT(f_j)) f_j))
                                  ≤ max(LM(f), LM(p)) = LM(f),

and therefore (ii) continues to hold. This completes the inductive step if LT(f_j)
divides LT(p) for some j.

The contrary case is that LT(p) is divisible by LT(f_i) for no i. Then we replace
p by p − LT(p), we change r to r + LT(p), and we leave all a_j alone. The
equality (*) is maintained, and (ii) continues to hold. Since LM(p) = LM(LT(p)),
Proposition 8.18 shows that LM(p) strictly decreases. Consequently (i) continues
to hold. Also, (iii) continues to hold because of the assumption that LT(p) is
divisible by LT(f_i) for no i. This completes the inductive step if LT(p) is divisible
by LT(f_i) for no i.

Proposition 8.16 shows that the induction can continue for only finitely many
steps. Since it must continue as long as p ≠ 0, the conclusion is that p = 0 after
some stage, and then the decomposition of the proposition has been proved.
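The proof of Proposition 8.20 becomes an algorithm once its choices are made systematically. The Python sketch below is one possible way of doing so and rests on assumptions that are not part of the text: a polynomial is stored as a dictionary from exponent tuples to rational coefficients, the monomial ordering is the lexicographic one (which agrees with Python's comparison of tuples), and the first f_j whose leading term divides LT(p) is always the one used. The names lt, add_term, divides, and divide are illustrative only.

```python
from fractions import Fraction

# A polynomial is a dict {exponent tuple: coefficient}; zero terms are never stored.
# Python compares tuples lexicographically, which is the lexicographic monomial
# ordering with X_1 > X_2 > ... > X_n (an assumption of this sketch).

def lt(p):
    """Leading term of a nonzero polynomial as (monomial, coefficient)."""
    m = max(p)
    return m, p[m]

def add_term(p, mono, coeff):
    """Add coeff * mono to p in place, deleting the term if it cancels."""
    c = p.get(mono, 0) + coeff
    if c == 0:
        p.pop(mono, None)
    else:
        p[mono] = c

def divides(m1, m2):
    """True if the monomial m1 divides the monomial m2."""
    return all(a <= b for a, b in zip(m1, m2))

def divide(f, fs):
    """Return (a, r) with f = sum_j a[j] * fs[j] + r as in Proposition 8.20."""
    p = dict(f)                    # the polynomial p of the proof, initially f
    a = [dict() for _ in fs]       # the coefficients a_1, ..., a_s, initially 0
    r = {}                         # the remainder r, initially 0
    while p:
        pm, pc = lt(p)
        for j, fj in enumerate(fs):
            fm, fc = lt(fj)
            if divides(fm, pm):    # first case of the proof: LT(f_j) divides LT(p)
                qm = tuple(x - y for x, y in zip(pm, fm))
                qc = Fraction(pc) / Fraction(fc)
                add_term(a[j], qm, qc)                 # a_j += LT(p)/LT(f_j)
                for m, c in fj.items():                # p -= (LT(p)/LT(f_j)) f_j
                    add_term(p, tuple(x + y for x, y in zip(qm, m)), -qc * c)
                break
        else:                      # contrary case: move LT(p) from p to r
            add_term(r, pm, pc)
            del p[pm]
    return a, r

# Example in K[X, Y]: divide X^2 Y + X Y^2 + Y^2 by (XY - 1, Y^2 - 1).
f = {(2, 1): 1, (1, 2): 1, (0, 2): 1}
f1 = {(1, 1): 1, (0, 0): -1}
f2 = {(0, 2): 1, (0, 0): -1}
quotients, remainder = divide(f, [f1, f2])
# quotients[0] is X + Y, quotients[1] is 1, and remainder is X + Y + 1
```

Each pass through the loop strictly lowers LM(p), which is the invariant used in the proof; a different rule for selecting j would in general give different a_j and r.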


Corollary 8.21. If {g_1, ..., g_s} is a Gröbner basis of a nonzero ideal I of
K[X_1, ..., X_n] and if f is any nonzero member of K[X_1, ..., X_n], then there
exist polynomials g and r such that f = g + r, g is in I, and no monomial appearing
in r with nonzero coefficient is divisible by LM(g_j) for any j. Moreover, r is
uniquely determined by these properties, and g has an expansion g = Σ_{i=1}^{s} a_i g_i
with LM(a_i g_i) ≤ LM(f) for all i.


REMARKS. The uniqueness statement implies in particular that r is independent
of the enumeration of the set {g_1, ..., g_s}. This corollary will give us some insight
into the way a Gröbner basis can resolve cancellation. Shortly we shall introduce
specific members of I that have cancellation built into their definition. Being in
I, they have expansions with remainder term 0, according to this corollary. Since
the remainder is unique, the corollary says that they can be rewritten in terms of
the Gröbner basis in a way that eliminates the cancellation.


PROOF. For existence, let {g_1, ..., g_s} be a Gröbner basis of I, and apply
Proposition 8.20 to f and the ordered set (g_1, ..., g_s). Then the existence follows
immediately.


For uniqueness, suppose that f = g_1 + r_1 = g_2 + r_2. Then r_1 − r_2 = g_2 − g_1
exhibits r_1 − r_2 as in I. Arguing by contradiction, suppose that r_1 ≠ r_2. The
hypothesis on r_1 and r_2 shows that no monomial with nonzero coefficient in
r_1 − r_2 is divisible by any LM(g_j), and in particular LM(r_1 − r_2) is not divisible
by any of the generators of the monomial ideal (LM(g_1), ..., LM(g_s)) = LM(I).
Since LM(r_1 − r_2) is a monomial in this ideal, this conclusion contradicts the last
conclusion of Lemma 8.17.


Suppose that X^α = X_1^{α_1} ··· X_n^{α_n} and X^β = X_1^{β_1} ··· X_n^{β_n} are two monomials in
K[X_1, ..., X_n]. Then we define their least common multiple LCM(X^α, X^β) to
be

    LCM(X^α, X^β) = X^γ = X_1^{γ_1} ··· X_n^{γ_n}    with γ_j = max(α_j, β_j) for all j.

This notion does not depend on the choice of a monomial ordering. Observe
for any two monomials M and N that LCM(M, N)/M and LCM(M, N)/N are
monomials.


If f_1 and f_2 are nonzero polynomials, then the expression

    (LCM(LM(f_1), LM(f_2)) / LT(f_1)) f_1 = (LCM(LM(f_1), LM(f_2)) / LM(f_1)) (f_1 / LC(f_1))

is a polynomial whose leading monomial is LCM(LM(f_1), LM(f_2)) and whose
leading coefficient is 1. We define the S-polynomial of f_1 and f_2 to be

    S(f_1, f_2) = (LCM(LM(f_1), LM(f_2)) / LT(f_1)) f_1 − (LCM(LM(f_1), LM(f_2)) / LT(f_2)) f_2.

This is the difference of two polynomials with the same leading monomial
LCM(LM(f_1), LM(f_2)) and with the same leading coefficient 1. Accordingly,
Proposition 8.18 shows that

    LM(S(f_1, f_2)) < LCM(LM(f_1), LM(f_2)).


The elements S(f_1, f_2) are the elements mentioned in the remarks with Corollary
8.21; the above inequality is a precise formulation of their built-in cancellation.
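As an illustration of the definition and of the built-in cancellation, here is a short worked computation. The two polynomials are chosen only for illustration and do not come from the text; the ordering is assumed to be the lexicographic one on K[X, Y] with X > Y.

```latex
% Illustrative choice (an assumption of this example):
%   f_1 = X^3 - 2XY,  f_2 = X^2 Y - 2Y^2 + X,  lexicographic ordering with X > Y.
\begin{align*}
\mathrm{LM}(f_1) &= X^3, \qquad \mathrm{LM}(f_2) = X^2Y, \qquad
\mathrm{LCM}(X^3, X^2Y) = X^3Y, \\
S(f_1, f_2) &= \frac{X^3Y}{X^3}\, f_1 - \frac{X^3Y}{X^2Y}\, f_2
  = Y(X^3 - 2XY) - X(X^2Y - 2Y^2 + X) = -X^2, \\
\mathrm{LM}\bigl(S(f_1, f_2)\bigr) &= X^2 < X^3Y .
\end{align*}
```

Both products have leading term X^3 Y, and these terms cancel in the difference, so that the leading monomial drops strictly below the least common multiple.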

Lemma 8.22 below says that whenever cancellation of this kind occurs in
any sum of products with polynomials f_1, ..., f_s, then the sum of products can be
rewritten in terms of the S-polynomials S(f_j, f_k). In this way the nature of the
cancellation has been made more transparent, partly being accounted for by the
definitions of the individual polynomials S(f_j, f_k).


Lemma 8.22. Let M and M_1, ..., M_s be monomials, let f_1, ..., f_s be nonzero
polynomials, and suppose that M_i LM(f_i) = M for all i. If c_1, ..., c_s are constants
such that LM(Σ_{i=1}^{s} c_i M_i f_i) < M, then the sum Σ_{i=1}^{s} c_i M_i f_i can be rewritten
in the form

    Σ_{i=1}^{s} c_i M_i f_i = Σ_{j<k} d_{jk} (M / LCM(LM(f_j), LM(f_k))) S(f_j, f_k)

for suitable constants d_{jk}. In the sum on the right side, each nonzero term has
leading monomial < M.


PROOF. Let us write L_{ij} = LCM(LM(f_i), LM(f_j)) for i ≠ j. We may assume
that all the c_i are nonzero, and we proceed by induction on s. There is nothing to
prove for s = 1. The key step is s = 2, for which we are given that the M term
of c_1 M_1 f_1 + c_2 M_2 f_2 is 0, i.e., that

    c_1 LC(f_1) + c_2 LC(f_2) = 0.    (*)

Substituting for LC(f_2) from (*) gives

    M L_{12}^{-1} S(f_1, f_2) = M f_1 / LT(f_1) − M f_2 / LT(f_2)
                              = M_1 f_1 / LC(f_1) − M_2 f_2 / LC(f_2)
                              = c_1^{-1} LC(f_1)^{-1} (c_1 M_1 f_1 + c_2 M_2 f_2),

and this proves the displayed formula of the lemma with d_{12} = c_1 LC(f_1).


Assume the result for s − 1 ≥ 2. We are given that Σ_{i=1}^{s} c_i LC(f_i) = 0, which
we break into two parts as

    c_1 LC(f_1) − (c_1 LC(f_1) / LC(f_2)) LC(f_2) = 0,

    (c_2 + c_1 LC(f_1) / LC(f_2)) LC(f_2) + Σ_{i=3}^{s} c_i LC(f_i) = 0.

The inductive hypothesis gives

    c_1 M_1 f_1 − (c_1 LC(f_1) / LC(f_2)) M_2 f_2 = d_{12} M L_{12}^{-1} S(f_1, f_2),

    (c_2 + c_1 LC(f_1) / LC(f_2)) M_2 f_2 + Σ_{i=3}^{s} c_i M_i f_i
        = Σ_{2≤j<k} d_{jk} M L_{jk}^{-1} S(f_j, f_k).

Adding these two formulas, we obtain the displayed formula of the lemma for
the case s, and the induction is complete.


Theorem 8.23. Let {g_1, ..., g_s} be a set of generators of a nonzero ideal I of
K[X_1, ..., X_n], and assume that g_i ≠ 0 for all i. Then the following conditions
on {g_1, ..., g_s} are equivalent:

    (a) {g_1, ..., g_s} is a Gröbner basis of I,
    (b) for each pair (g_j, g_k) with S(g_j, g_k) ≠ 0, every expansion of S(g_j, g_k) as
        S(g_j, g_k) = Σ_{i=1}^{s} a_{ijk} g_i + r with the two properties that
            (i) LM(a_{ijk} g_i) ≤ LM(S(g_j, g_k)) and
            (ii) no monomial appearing in r with nonzero coefficient is divisible
                 by LM(g_j) for any j
        has r = 0,
    (c) for each pair (g_j, g_k) with S(g_j, g_k) ≠ 0, there is an expansion of the
        form S(g_j, g_k) = Σ_{i=1}^{s} a_{ijk} g_i with LM(a_{ijk} g_i) ≤ LM(S(g_j, g_k)).


REMARKS. Because of the equivalence of (b) and (c), the generalized division
algorithm (Proposition 8.20) gives us a procedure for testing whether these
conditions are satisfied by {g_1, ..., g_s}. Namely we follow through the steps in
the proof of Proposition 8.20 in whatever fashion we please for each nonzero
S(g_j, g_k). If we get remainder r = 0 for each pair (j, k), then the conditions are
satisfied. If we get a nonzero remainder r for some pair, then the conditions are
not satisfied. In view of the equivalence of (a) with these conditions, we have an
effective (though somewhat tedious) way of checking whether {g_1, ..., g_s} is a
Gröbner basis.


PROOF. We prove that (a) implies (b) and that (c) implies (a). Since (b) 
certainly implies (c), the proof will be complete. 

Let (a) hold, i.e., let {g_1, ..., g_s} be a Gröbner basis. If S(g_j, g_k) ≠ 0, then
S(g_j, g_k) is a nonzero member of I because each g_i lies in I, and S(g_j, g_k)
consequently has an expansion as Σ_{i=1}^{s} a_i g_i + r with r = 0. By Corollary 8.21 it
has a possibly different expansion with r = 0 and with LM(a_i g_i) ≤ LM(S(g_j, g_k))
for each i. On the other hand, in any expansion of S(g_j, g_k) as Σ_{i=1}^{s} a_i g_i + r
such that (ii) holds, whether or not LM(a_i g_i) ≤ LM(S(g_j, g_k)), r must be 0 by
Corollary 8.21. This proves (b).

To prove that (c) implies (a), we argue by contradiction. Among all expansions
of members of I as Σ_{i=1}^{s} b_i g_i such that LT(Σ_{i=1}^{s} b_i g_i) is not in the ideal
(LT(g_1), ..., LT(g_s)), choose one for which

    M = max_{1≤i≤s} LM(b_i g_i)

is as small as possible; this choice exists by Proposition 8.16. For this choice, let

    f = Σ_{i=1}^{s} b_i g_i.    (*)

Define M_i = LM(b_i) for each i with b_i ≠ 0. If i_0 is an index with M =
LM(b_{i_0} g_{i_0}), then M = M_{i_0} LM(g_{i_0}) by Proposition 8.18, and hence M lies in
(LT(g_1), ..., LT(g_s)). Since LT(Σ_{i=1}^{s} b_i g_i) is not in (LT(g_1), ..., LT(g_s)), it
follows that LM(Σ_{i=1}^{s} b_i g_i) < M. Within the set {1, ..., s}, define a subset E to
consist of those i for which M_i LM(g_i) = M. This set contains i_0, and it has the
property that all i not in E have LM(b_i g_i) < M. We regroup f as

    f = Σ_{i∈E} b_i g_i + Σ_{i∉E} b_i g_i
      = Σ_{i∈E} LC(b_i) M_i g_i + Σ_{i∈E} (b_i − LT(b_i)) g_i + Σ_{i∉E} b_i g_i.

Every term in the second and third sums on the right side has leading monomial
< M, and so does f. Therefore LM(Σ_{i∈E} LC(b_i) M_i g_i) < M. It follows that
the expression Σ_{i∈E} LC(b_i) M_i g_i is of the form considered in Lemma 8.22 with
c_i = LC(b_i) for i ∈ E (and c_i = 0 for i ∉ E). The lemma tells us that


    Σ_{i∈E} LC(b_i) M_i g_i = Σ_{j<k} d_{jk} (M/L_{jk}) S(g_j, g_k)

for suitable scalars d_{jk}, where L_{jk} = LCM(LM(g_j), LM(g_k)).

Now we apply the hypothesis (c), expanding each S(g_j, g_k) in some way as
S(g_j, g_k) = Σ_{i=1}^{s} a_{ijk} g_i with the a_{ijk} equal to polynomials such that

    LM(a_{ijk} g_i) ≤ LM(S(g_j, g_k)).    (**)

Substituting for S(g_j, g_k), we obtain

    f = Σ_{i,j,k} d_{jk} (M/L_{jk}) a_{ijk} g_i + Σ_{i∈E} (b_i − LT(b_i)) g_i + Σ_{i∉E} b_i g_i.    (†)

We know that every term in the second and third sums on the right side of (†)
has leading monomial < M, and we shall estimate the leading monomial of each
term in the first sum. Multiplying the inequality

    LM(S(g_j, g_k)) < LCM(LM(g_j), LM(g_k)) = L_{jk}

by the monomial M/L_{jk} yields

    (M/L_{jk}) LM(S(g_j, g_k)) < M    (††)

for every pair (j, k). Combining (**) and (††) gives

    LM((M/L_{jk}) a_{ijk} g_i) = (M/L_{jk}) LM(a_{ijk} g_i) ≤ (M/L_{jk}) LM(S(g_j, g_k)) < M.

Since each d_{jk} is a scalar, every term in the first sum on the right side of (†)
has leading monomial < M. Thus (†) is an expansion of a member of I that
contradicts the minimality of max_i LM(b_i g_i) in the expansion (*). From this
contradiction we conclude that (a) holds.


EXAMPLE OF A VERIFICATION THAT A SET IS A GRÖBNER BASIS. This example
continues Example 2 of “Examples with lexicographic ordering” in the previous
section. A nonzero ideal I is generated by members of K[X_1, ..., X_n] of the
form (L_1, ..., L_s), where each L_j is a linear combination of X_1, ..., X_n. After
initial manipulations we assume that the matrix of coefficients of L_1, ..., L_s is in
reduced row-echelon form. The assertion is that {L_1, ..., L_s} is then a Gröbner
basis of I. To prove this, we write L_j = X_{n_j} + l_j, where X_{n_j} is the associated
corner variable and l_j is a linear combination of X_{n_j+1}, ..., X_n such that the
coefficient of each corner variable is 0. If j < k, then

    S(L_j, L_k) = X_{n_k} l_j − X_{n_j} l_k = −l_k (X_{n_j} + l_j) + l_j (X_{n_k} + l_k) = −l_k L_j + l_j L_k.

The second term on the right side contains no variable X_1, ..., X_{n_j}, but the first
term on the right side contains X_{n_j}. Therefore, relative to the lexicographic
ordering, we have LM(S(L_j, L_k)) = LM(−l_k L_j) = LM(l_k) X_{n_j}. Consequently
LM(l_j L_k) ≤ LM(S(L_j, L_k)) (and actually strict inequality must hold). Thus the
displayed formula shows that S(L_j, L_k) = a_j L_j + a_k L_k in the form demanded
by (c) of Theorem 8.23. Since (c) implies (a) in the theorem, {L_1, ..., L_s} is a
Gröbner basis of I.


Corollary 8.24 (Buchberger’s algorithm).¹⁵ Each nonzero ideal I in the polynomial
ring K[X_1, ..., X_n] has a Gröbner basis. Such a basis can be obtained by
the following procedure: Start from any set {f_1, ..., f_s} of nonzero generators,
apply the generalized division algorithm in some fashion to each S(f_j, f_k) and
to the generating set {f_1, ..., f_s}, and adjoin to the set of generators any nonzero
remainders obtained from this process. Iterate this process for enlarging a set
{f'_1, ..., f'_{s'}} of generators as long as a nonzero remainder is obtained for some
S(f'_j, f'_k). This process must terminate at some point with all remainders equal
to 0, and the resulting generating set is a Gröbner basis.


PROOF. At the stage of the iteration that works with the set {f'_1, ..., f'_{s'}} of
generators, any nonzero remainder r that arises has the property that no monomial
occurring in r is divisible by any LM(f'_j). By Lemma 8.17, LT(r) is not a member
of (LT(f'_1), ..., LT(f'_{s'})). However, at the next stage when r has been designated
as one of the generators of I, LT(r) has become one of the generators of this
ideal. Therefore the ideal (LT(f'_1), ..., LT(f'_{s'})) strictly increases as we pass from
one stage to the next. Since K[X_1, ..., X_n] is Noetherian, its ideals satisfy the
ascending chain condition, and this chain of ideals must stabilize. Consequently
all the remainders must be 0 at some point, and then Theorem 8.23 shows that
the set of generators is a Gröbner basis.
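The procedure of Corollary 8.24 can be sketched in a few lines of Python. This is a naive version written for illustration, not the improved algorithm that computer algebra systems use, and it rests on assumptions that are not in the text: polynomials are dictionaries from exponent tuples to rational coefficients, the ordering is the lexicographic one (Python's tuple comparison), and the division step always uses the first applicable divisor. The names lt, add_term, remainder, s_poly, and buchberger are illustrative only.

```python
from fractions import Fraction

# Polynomials are dicts {exponent tuple: coefficient}; tuple comparison in Python
# is the lexicographic monomial ordering with X_1 > X_2 > ... > X_n.

def lt(p):
    """Leading (monomial, coefficient) of a nonzero polynomial."""
    m = max(p)
    return m, p[m]

def add_term(p, mono, coeff):
    """Add coeff * mono to p in place, deleting the term if it cancels."""
    c = p.get(mono, 0) + coeff
    if c == 0:
        p.pop(mono, None)
    else:
        p[mono] = c

def remainder(f, fs):
    """Remainder r of Proposition 8.20 (the first applicable divisor is used)."""
    p, r = dict(f), {}
    while p:
        pm, pc = lt(p)
        for fj in fs:
            fm, fc = lt(fj)
            if all(a <= b for a, b in zip(fm, pm)):     # LM(f_j) divides LM(p)
                qm = tuple(a - b for a, b in zip(pm, fm))
                qc = Fraction(pc) / Fraction(fc)
                for m, c in fj.items():                   # p -= (LT(p)/LT(f_j)) f_j
                    add_term(p, tuple(a + b for a, b in zip(qm, m)), -qc * c)
                break
        else:                                             # move LT(p) into r
            add_term(r, pm, pc)
            del p[pm]
    return r

def s_poly(f, g):
    """S-polynomial S(f, g) of two nonzero polynomials."""
    (fm, fc), (gm, gc) = lt(f), lt(g)
    lcm = tuple(max(a, b) for a, b in zip(fm, gm))
    s = {}
    for poly, mono, coeff, sign in ((f, fm, fc, 1), (g, gm, gc, -1)):
        shift = tuple(a - b for a, b in zip(lcm, mono))
        for m, c in poly.items():
            add_term(s, tuple(a + b for a, b in zip(shift, m)), sign * Fraction(c) / coeff)
    return s

def buchberger(gens):
    """Enlarge a generating set until every S-polynomial has remainder 0."""
    basis = [dict(g) for g in gens]
    pairs = [(i, j) for i in range(len(basis)) for j in range(i)]
    while pairs:
        i, j = pairs.pop()
        r = remainder(s_poly(basis[i], basis[j]), basis)
        if r:                                             # adjoin a nonzero remainder
            pairs.extend((len(basis), k) for k in range(len(basis)))
            basis.append(r)
    return basis

# I = (X^2 + 2XY^2, XY + 2Y^3 - 1) in K[X, Y], lexicographic ordering with X > Y.
f1 = {(2, 0): 1, (1, 2): 2}
f2 = {(1, 1): 1, (0, 3): 2, (0, 0): -1}
basis = buchberger([f1, f2])
leading_monomials = sorted(max(g) for g in basis)
# leading_monomials corresponds to Y^3, X, XY, X^2
```

With these two generators the sketch adjoins −X and −2Y^3 + 1, which are scalar multiples of the elements X and 2Y^3 − 1 found by hand in the example that follows the corollary. A different rule for choosing divisors or pairs can adjoin different remainders, but the outcome is a Gröbner basis in every case.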


EXAMPLE OF THE COMPUTATION OF A GRÖBNER BASIS. We return to Example
3 of “Examples with lexicographic ordering” in the previous section. In K[X, Y],
we let f_1(X, Y) = X^2 + 2XY^2 and f_2(X, Y) = XY + 2Y^3 − 1, and we define
I = (f_1, f_2). We seek a Gröbner basis of I, using the lexicographic ordering.
Direct computation gives S(f_1, f_2) = Y(X^2 + 2XY^2) − X(XY + 2Y^3 − 1) = X.
Since X is not divisible by LM(f_1) or by LM(f_2), S(f_1, f_2) = 0 f_1 + 0 f_2 + X
is an expansion of S(f_1, f_2) as in Theorem 8.23b with r = X. The procedure
of Corollary 8.24 says to adjoin f_3 = X to the generating set and test again.
Direct computation gives S(f_1, f_3) = 1(X^2 + 2XY^2) − X · X = 2XY^2, and
S(f_1, f_3) = 0 f_1 + 0 f_2 + (2Y^2) f_3 + 0 is an expansion of S(f_1, f_3) as in (c),
since LM(2Y^2 f_3) ≤ LM(S(f_1, f_3)). Thus S(f_1, f_3) gives us a 0 remainder, hence
nothing new to process. In addition, we have S(f_2, f_3) = 1(XY + 2Y^3 − 1) −
Y · X = 2Y^3 − 1. No term of this is divisible by any of the leading monomials of
f_1, f_2, f_3, namely X^2, XY, X. Hence 2Y^3 − 1 is a nonzero remainder.¹⁶ Therefore
we are to adjoin f_4 = 2Y^3 − 1 to our set. Computation gives S(f_1, f_4) =
2XY^5 + (1/2)X^2 = (2Y^5 + (1/2)X) f_3, S(f_2, f_4) = 2Y^5 − Y^2 + (1/2)X = (1/2) f_3 + Y^2 f_4,
and S(f_3, f_4) = (1/2)X = (1/2) f_3. In every case each term has leading monomial at
most the leading monomial of the S-polynomial. Hence all remainders are 0, and
Corollary 8.24 says that {f_1, f_2, f_3, f_4} is a Gröbner basis of I.

¹⁵ Computer programs typically use an improved version of this algorithm to compute Gröbner
bases.

¹⁶ It was not a bad choice of decomposition that led to a nonzero remainder when some other
decomposition might have given us 0; the equivalence of (b) and (c) in Theorem 8.23 assures us of
that fact.


Corollary 8.25 (solution of the ideal-membership problem). If I is a nonzero
ideal in K[X_1, ..., X_n] and f is a polynomial, then a procedure for deciding
whether f lies in I is as follows: introduce a monomial ordering, construct
a Gröbner basis {g_1, ..., g_s} of I by means of Corollary 8.24, and apply the
generalized division algorithm to write f = Σ_{i=1}^{s} a_i g_i + r for polynomials
a_1, ..., a_s, r such that no monomial appearing in r with nonzero coefficient is
divisible by LM(g_j) for any j. Then f lies in I if and only if r = 0.


PROOF. Corollary 8.24 produces the Gröbner basis, and Corollary 8.21 affirms
that this procedure decides whether f lies in I.
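A short worked instance of this membership test may help. It uses the ideal I = (f_1, f_2) of the preceding example with its Gröbner basis {f_1, f_2, f_3, f_4} for the lexicographic ordering with X > Y; the test polynomial h is chosen here only for illustration and does not come from the text.

```latex
% f_1 = X^2 + 2XY^2,  f_2 = XY + 2Y^3 - 1,  f_3 = X,  f_4 = 2Y^3 - 1.
% Illustrative test polynomial (an assumption of this example): h = XY^2 + Y^3.
\begin{align*}
h = XY^2 + Y^3 &= Y\, f_2 + \bigl(-Y + \tfrac12\bigr) f_4 + \tfrac12 ,
  & r &= \tfrac12 \neq 0, \quad\text{so } h \notin I, \\
X &= 0\, f_1 + 0\, f_2 + 1\, f_3 + 0\, f_4 + 0 ,
  & r &= 0, \quad\text{so } X \in I .
\end{align*}
```

In the first line no monomial of r = 1/2 is divisible by X^2, XY, X, or Y^3, and LM(Y f_2) = XY^2 and LM((−Y + 1/2) f_4) = Y^4 are both at most LM(h) = XY^2. By contrast, dividing X by the original generators (f_1, f_2) alone leaves the remainder X, even though X lies in I; this is why the test is run against a Gröbner basis.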


Corollary 8.26 (solution of the proper-ideal problem). If I is a nonzero ideal
in K[X_1, ..., X_n], then a procedure for deciding whether I = K[X_1, ..., X_n]
is to compute a Gröbner basis for I and to see whether one of its members is a
nonzero scalar c.


PROOF. If I has a nonzero scalar as one of its generators, then 1 lies in I,
and hence I certainly equals K[X_1, ..., X_n]. Conversely suppose that
I = K[X_1, ..., X_n], so that 1 lies in I. Corollary 8.24 produces a Gröbner
basis {g_1, ..., g_s}. Since LT(1) = 1 and since
LT(I) = (LT(g_1), ..., LT(g_s)), the monomial 1 must lie in (LT(g_1), ..., LT(g_s)).
Since 1 is a monomial, Lemma 8.17 shows that it must be divisible by LM(g_j)
for some j. Therefore LM(g_j) = 1. Since 1 is the smallest monomial in any
monomial ordering, it is the only monomial appearing with a nonzero coefficient
in g_j. Therefore g_j is a nonzero scalar.


In many applications of Gröbner bases, there is some flexibility in what monomial
ordering to impose in obtaining the Gröbner basis. In Corollaries 8.25 and
8.26, for example, absolutely any monomial ordering works fine. The actual
calculation of Gröbner bases is often computationally demanding, and thus it
is worthwhile to use such a basis that takes relatively little time to compute.
According to computer scientists,¹⁷ Gröbner bases are the most widely useful
when computed relative to the lexicographic ordering, but they are then also
the most time-consuming to compute. The monomial orderings that make the
computation of Gröbner bases proceed quickly tend to be ones that first bound
the total degree in one or two steps. One of the reasons that this kind of monomial
ordering works so efficiently is that once the total degree is bounded, there are
only finitely many monomials less than any given monomial M.

¹⁷ The Web essay “Representation and monomial orders,”
http://magma.usyd.edu/au/magma/handbook/1177, within the documentation of the
Magma computer algebra system at the University of Sydney contains a discussion of
various monomial orders and their uses and advantages.


9. Uniqueness of Reduced Gröbner Bases


In this section, K continues to denote a field, and we work with a fixed monomial
ordering on K[X_1, ..., X_n]. Ideals in K[X_1, ..., X_n] will always be specified
by giving finite sets of generators. Our objective in this section is to show how
any Gröbner basis can be “reduced” and that a “reduced” Gröbner basis for an
ideal is unique. A by-product of the uniqueness argument will be a way of testing
two ideals for equality.

Any finite set of generators of I that contains a Gröbner basis is again a Gröbner
basis. Thus a constructed Gröbner basis will often be unnecessarily large. One
simple kind of redundancy is addressed by Lemma 8.27 below.


Lemma 8.27. If {g_1, ..., g_s} is a Gröbner basis for a nonzero ideal I in
K[X_1, ..., X_n] and if LM(g_1) lies in the ideal (LT(g_2), ..., LT(g_s)), then
{g_2, ..., g_s} is a Gröbner basis of I.


REMARK. Lemma 8.17 shows how to check whether LM(g_1) lies in the ideal
(LT(g_2), ..., LT(g_s)); all we have to do is see whether some LM(g_j) for j > 1
divides LM(g_1).


PROOF. By hypothesis, (LT(g_2), ..., LT(g_s)) = (LT(g_1), ..., LT(g_s)) = LT(I).
Therefore {g_2, ..., g_s} is a Gröbner basis of I. (Recall that the definition of
Gröbner basis does not assume that the set generates the ideal; Proposition 8.19
deduces that it generates.)


A Gröbner basis {g_1, ..., g_s} of a nonzero ideal I is said to be minimal if
LC(g_j) = 1 for all j and if no LM(g_i) is divisible by LM(g_j) for any j ≠ i.
Lemma 8.27 shows that in trying to transform a Gröbner basis into a form for
which a uniqueness result will apply, there is no loss of generality in assuming
that the given Gröbner basis is minimal.


EXAMPLE. As in the example following Corollary 8.24, let I be the ideal in
K[X, Y] given by I = (f_1, f_2) with f_1(X, Y) = X^2 + 2XY^2 and f_2(X, Y) =
XY + 2Y^3 − 1. Then we saw that {f_1, f_2, f_3, f_4} is a Gröbner basis of I in
the lexicographic ordering, where f_3(X, Y) = X and f_4(X, Y) = 2Y^3 − 1.
The leading monomials are LM(f_1) = X^2, LM(f_2) = XY, LM(f_3) = X, and
LM(f_4) = Y^3. The first two are divisible by the third. Therefore {X, Y^3 − 1/2} is
the corresponding minimal Gröbner basis.


Unfortunately an ideal can have more than one minimal Gröbner basis, as is
shown in Problem 17 at the end of the chapter. A Gröbner basis {g_1, ..., g_s} of
an ideal I is said to be reduced if it is minimal and if for each i, no monomial
appearing in g_i with nonzero coefficient is divisible by LM(g_j) for some j ≠ i.


Theorem 8.28 (uniqueness of reduced Gröbner basis). If I is a nonzero ideal
in K[X_1, ..., X_n], then I has a unique reduced Gröbner basis, and this can be
obtained algorithmically starting from any minimal Gröbner basis.


PROOF OF UNIQUENESS. Let {g_1, ..., g_s} be any Gröbner basis. Since LT(I) =
(LT(g_1), ..., LT(g_s)), Lemma 8.17 shows that any LM(f) for f ∈ I is divisible by
LM(g_j) for some j. If {h_1, ..., h_t} is a second Gröbner basis, then this argument
shows that each LM(h_i) is divisible by some LM(g_j). Turned around, the argument
shows that LM(g_j) is divisible by some LM(h_k). Since {h_1, ..., h_t} is assumed
minimal, LM(h_i) cannot be divisible by LM(h_k) if i ≠ k. Thus LM(h_i) = LM(h_k),
and these equal LM(g_j). Then it follows that s = t and that we may enumerate
any two minimal Gröbner bases in such a way that the leading monomial of the
i-th member of each basis is the same for each i with 1 ≤ i ≤ s.

With this normalization in place, let us show that g_i = h_i. To do so, we expand
g_i − h_i as g_i − h_i = Σ_{j=1}^{s} a_j h_j with LM(g_i − h_i) = max_j LM(a_j h_j) in accordance
with (b) of Theorem 8.23. Choose k such that the maximum on the right side is
attained at k, i.e., such that

    LM(a_k) LM(h_k) = LM(g_i − h_i).    (*)

Arguing by contradiction, suppose that the right side of (*) is nonzero. Then it
must be a monomial occurring in either g_i or h_i. Since the two Gröbner bases are
reduced, no monomial occurring in g_i is divisible by LM(g_k) = LM(h_k) if k ≠ i,
and similarly for monomials occurring in h_i. We conclude that k = i and that
LM(h_i) = LM(g_i − h_i). But this is impossible by Proposition 8.18 if g_i − h_i ≠ 0,
since LM(g_i) = LM(h_i) and LC(g_i) = LC(h_i) = 1. Therefore the right side of (*)
is 0, and g_i = h_i.


PROOF OF EXISTENCE. Let {g_1, ..., g_s} be a minimal Gröbner basis of I. As
was shown in the proof of uniqueness, the leading monomials LM(g_1), ..., LM(g_s)
are independent of the choice of the actual minimal basis. Looking at the definition
of “reduced,” we see therefore that the property of being reduced is a property of
each member g_i of the basis separately. That is, it is meaningful to say that g_i
is reduced if no monomial appearing in g_i with nonzero coefficient is divisible
by LM(g_j) for some j ≠ i. We shall show how to replace g_i by an element g_i'
with the same leading monomial in such a way that the new set is still a Gröbner
basis and g_i' is reduced, and then the proof will be complete. There is no loss of
generality in taking i = 1.


Applying the generalized division algorithm (Proposition 8.20), we write

    g_1 = Σ_{j=2}^{s} a_j g_j + r    (**)

in such a way that

    LM(g_1) ≥ max_{2≤j≤s} LM(a_j g_j)    (†)

and that no monomial appearing in r with nonzero coefficient is divisible by
LM(g_j) for any j ≥ 2. If we define g_1' to be this element r, then the element g_1'
is reduced in the above sense, and the only question is whether {g_1', g_2, ..., g_s}
is a Gröbner basis. Since {g_1, ..., g_s} is minimal, LM(g_1) is not divisible by any
LM(g_j) for j ≥ 2. Consequently LM(g_1) appears with nonzero coefficient on the
left side of (**), and it does not appear in any of the terms a_j g_j with nonzero
coefficient on the right side. Consequently it appears in r = g_1', and LM(g_1) ≤
LM(g_1'). On the other hand, (**) and (†) together imply that LM(g_1') ≤ LM(g_1).
Therefore LM(g_1') = LM(g_1), and LT(I) = (LT(g_1), LT(g_2), ..., LT(g_s)) =
(LT(g_1'), LT(g_2), ..., LT(g_s)). Consequently {g_1', g_2, ..., g_s} is a Gröbner basis
by definition.
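The replacement step in the existence proof can be seen in a small worked case. The ideal below is chosen only for illustration and does not come from the text; the ordering is assumed to be the lexicographic one on K[X, Y] with X > Y.

```latex
% Illustrative minimal Groebner basis (an assumption of this example):
%   g_1 = X + Y^2,  g_2 = Y^2 - 1,  with LM(g_1) = X and LM(g_2) = Y^2.
\begin{align*}
S(g_1, g_2) &= Y^2 (X + Y^2) - X (Y^2 - 1) = X + Y^4 = 1 \cdot g_1 + Y^2 \cdot g_2, \\
g_1 &= 1 \cdot g_2 + (X + 1), \qquad g_1' = r = X + 1, \\
\{g_1', g_2\} &= \{X + 1,\; Y^2 - 1\}.
\end{align*}
```

The first line is an expansion as in (c) of Theorem 8.23, since LM(g_1) = X and LM(Y^2 g_2) = Y^4 are both at most LM(S(g_1, g_2)) = X, so that {g_1, g_2} is a Gröbner basis. It is minimal but not reduced, because the monomial Y^2 of g_1 is divisible by LM(g_2). Dividing g_1 by g_2 as in the proof yields g_1' = X + 1 with the same leading monomial X, and {X + 1, Y^2 − 1} is reduced. The same ideal thus has the two distinct minimal Gröbner bases {X + Y^2, Y^2 − 1} and {X + 1, Y^2 − 1}, only the second of which is reduced.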


Corollary 8.29 (solution of the ideal-equality problem). Let I and J be two
nonzero ideals in K[X_1, ..., X_n] specified in terms of finite sets of generators.
Then I = J if and only if the reduced Gröbner bases of I and J relative to a
single monomial ordering are the same.


REMARK. As with the solution of problems listed in Corollaries 8.25 and 8.26,
the desired end is independent of the monomial ordering, and in practice one
might just as well start from a monomial ordering for which the computation of
Gröbner bases is relatively easy.


PROOF. This result is immediate from Corollary 8.24 (constructive existence
of Gröbner bases) and Theorem 8.28.


10. Simultaneous Systems of Polynomial Equations 


In this section we combine our techniques concerning the resultant and Gröbner
bases to attack the original problem discussed in Section 1, that of solving systems
of simultaneous polynomial equations in several variables. Our interest ultimately
will be in the case that the underlying field is algebraically closed.

Corollary 8.26 and the Nullstellensatz already combine to give a criterion for
such a system to have no solutions: We regard the system as the zero locus of
an ideal, and we calculate a Gröbner basis for the ideal. Then the system has no
solutions if and only if the Gröbner basis contains a constant polynomial, i.e., if
and only if the reduced Gröbner basis is {1}.

Let us now consider the problem of finding the solutions when solutions exist.
We begin with the case of two equations in two unknowns over the field C,
recalling what we know from the theory of the resultant. Consider the system

    X^2 Y + Y^2 = 5,
    XY = 2.

Set f(X, Y) = X^2 Y + Y^2 − 5 and g(X, Y) = XY − 2. To find points (x, y) with
f(x, y) = g(x, y) = 0, using the style of Sections 1-3, we compute the resultant
of f and g in the X variable, say, and obtain the polynomial Y^4 − 5Y^2 + 4Y.
Setting this equal to 0 gives us y = 0, y = 1, and y = (1/2)(−1 ± √17). We can
then substitute each such y into x^2 y + y^2 = 5 and get candidates (x, y). Doing so
for y = 0 gives us no candidates, and doing so for each of the other three values
of y gives us two values of x, differing only in a sign. So we get six pairs (x, y).
However, only three of these satisfy the second given equation, xy = 2, one for
each nonzero value of y. Thus the resultant gives us a handle on the problem of
finding solutions, but it has two shortcomings: it produced a value of y yielding
no solution pairs (x, y), and it produced extraneous x values.

To find points (x, y) with f(x, y) = g(x, y) = 0, using the style of Sections
7-10, we consider (f, g) as an ideal in C[X, Y], and we are interested
in the locus of common zeros V_C((f, g)) of the ideal. We start by finding a
reduced Gröbner basis with respect to a suitable ordering. The usual lexicographic
ordering will do fine here, and the result is {X + (1/2)Y^2 − 5/2, Y^3 − 5Y + 4}. By
what may seem to be good fortune, the second element depends on Y alone, and
the roots are y = 1 and y = (1/2)(−1 ± √17). If we substitute these values into
the equation x + (1/2)y^2 − 5/2 = 0, we get one value of x for each y. We can solve
because the coefficient 1 of x is nonzero for each y in question. No pair (x, y)
that we obtain is superfluous because the locus of common zeros of f and g is
identical with the locus of common zeros of the members of the Gröbner basis.
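The comparison between the two approaches can be checked numerically. The following Python sketch uses floating-point arithmetic with a tolerance chosen for illustration; the names f, g, ys, candidates, and solutions are illustrative and not part of the text. It builds the six candidate pairs that come from the resultant and the three pairs that come from the reduced Gröbner basis, and it tests each against both equations.

```python
import math

def f(x, y):
    return x * x * y + y * y - 5

def g(x, y):
    return x * y - 2

# Nonzero roots of the resultant Y^4 - 5Y^2 + 4Y = Y (Y - 1)(Y^2 + Y - 4);
# these are also the roots of the Groebner basis element Y^3 - 5Y + 4.
ys = [1.0, (-1 + math.sqrt(17)) / 2, (-1 - math.sqrt(17)) / 2]

# Resultant route: solve x^2 y + y^2 = 5 for x, giving two signs for each y.
candidates = []
for y in ys:
    x = math.sqrt((5 - y * y) / y)      # (5 - y^2)/y is positive for each of these y
    candidates += [(x, y), (-x, y)]

# Groebner route: the basis element X + (1/2)Y^2 - 5/2 gives one x for each y.
solutions = [((5 - y * y) / 2, y) for y in ys]

tol = 1e-9                               # illustrative tolerance for rounding error
genuine = [(x, y) for (x, y) in candidates if abs(g(x, y)) < tol]
print(len(candidates), len(genuine))     # 6 candidates, of which 3 satisfy xy = 2
print(all(abs(f(x, y)) < tol and abs(g(x, y)) < tol for (x, y) in solutions))
```

Half of the resultant candidates fail the equation xy = 2, whereas every pair obtained from the Gröbner basis satisfies both equations up to rounding error.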

This approach raises several questions about a possible generalization: 


    (i) Under what conditions can we expect that a Gröbner basis for an ideal I
        in K[X, Y] will contain a member that depends just on Y?
    (ii) If the Gröbner basis contains no element that depends just on Y, then
         what can we expect?
    (iii) If we are able to solve for values of y, under what conditions can we use
          the remaining member(s) of the Gröbner basis to solve for x?

Part of the answer to (i) is contained in the Elimination Theorem proved as
Theorem 8.30 below. This theorem says for the lexicographic ordering that the
members of a Gröbner basis that depend just on Y generate I ∩ K[Y]; in fact,
they form a Gröbner basis of this ideal of K[Y]. For the case that I = (f, g), the
resultant is a member of I ∩ K[Y]. Thus a nonzero resultant ensures that some
member of the Gröbner basis will depend just on Y; on the other hand, I ∩ K[Y]
has to be a principal ideal in K[Y], and any Gröbner basis of that principal ideal
has to contain the ideal’s generator (up to a scalar factor). By contrast, a zero
resultant leads us to question (ii) because it says, by Theorem 8.1, that f and
g have a common factor h(X, Y) of positive degree in X as long as both f and
g have positive degree in X. The largest power of X in h has as coefficient
a polynomial in Y that has only finitely many roots, and if K is algebraically
closed, then every y unequal to one of these roots will produce an x such that
h(x, y) = 0 and therefore such that f(x, y) = g(x, y) = 0. In other words,
except in degenerate cases a zero resultant implies that there cannot be a member
of the Gröbner basis that depends just on Y. Finally the answer to (iii) lies deeper
and is contained in the Extension Theorem, which is proved as Theorem 8.31
below.

Let I be a nonzero ideal in $K[X_1, \dots, X_n]$, K being any field for now. If
$0 \le k \le n-1$, then the $k^{\mathrm{th}}$ elimination ideal of I is the ideal
$I \cap K[X_{k+1}, \dots, X_n]$ in $K[X_{k+1}, \dots, X_n]$. A monomial ordering on
$K[X_1, \dots, X_n]$ will be said to be of k-elimination type if any monomial
containing any of $X_1, \dots, X_k$ to a positive power is greater than any monomial in
$X_{k+1}, \dots, X_n$ alone. The usual lexicographic ordering is of k-elimination type
for every k. An example of a monomial ordering of k-elimination type that is of
great interest in applications is the one of Bayer–Stillman described in Example 4
of monomial orderings in Section 7.
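
The definition of k-elimination type can be tested mechanically on a bounded range of exponents. The following Python sketch is an illustration only and not part of the theory: monomials are encoded as exponent tuples, the helper names `lex_key`, `grlex_key`, and `is_k_elimination_type` are hypothetical, and the check covers only exponents up to an assumed bound, so it can refute the property but cannot prove it in general.

```python
from itertools import product

def lex_key(exponents):
    """Sort key for the lexicographic ordering with X_1 > X_2 > ... > X_n."""
    return tuple(exponents)

def grlex_key(exponents):
    """Sort key for the graded lexicographic ordering: total degree first."""
    return (sum(exponents), tuple(exponents))

def is_k_elimination_type(key, n, k, max_exponent=3):
    """Bounded check: every monomial involving X_1..X_k beats every monomial
    in X_{k+1}..X_n alone, for all exponents up to max_exponent."""
    monomials = list(product(range(max_exponent + 1), repeat=n))
    involving_first_k = [m for m in monomials if any(m[:k])]
    in_last_only = [m for m in monomials if not any(m[:k])]
    return all(key(u) > key(v) for u in involving_first_k for v in in_last_only)

# Lexicographic ordering passes the bounded test for every k in range.
print([is_k_elimination_type(lex_key, 3, k) for k in range(3)])
# Graded lex fails for k = 1, since X_1 has lower total degree than X_2^2.
print(is_k_elimination_type(grlex_key, 3, 1))
```

The graded ordering fails because total degree is compared first, which is exactly what an elimination ordering must not do for the variables being eliminated.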


Theorem 8.30 (Elimination Theorem). Let K be any field, let I be a
nonzero ideal in $K[X_1, \dots, X_n]$, let $0 \le k < n$, and fix a monomial ordering
of k-elimination type. If $\{g_1, \dots, g_s\}$ is a Gröbner basis of I, then the subset of
members of $\{g_1, \dots, g_s\}$ depending only on $X_{k+1}, \dots, X_n$ is a Gröbner basis of
the $k^{\mathrm{th}}$ elimination ideal $J = I \cap K[X_{k+1}, \dots, X_n]$.


PROOF. Relabeling the members of $\{g_1, \dots, g_s\}$, we may assume that the $g_i$'s
lying in J are $g_1, \dots, g_t$. The first step is to show that $J = (g_1, \dots, g_t)$. If
$f \in J$ is given, we apply the generalized division algorithm (Proposition 8.20)
and write $f = \sum_{i=1}^{s} a_i g_i + r$ with $\mathrm{LM}(a_i g_i) \le \mathrm{LM}(f)$ for all i and with no
monomial appearing in r with nonzero coefficient divisible by $\mathrm{LM}(g_j)$ for any
j. Corollary 8.21 shows that r = 0. If $a_i \ne 0$ and i is not $\le t$, then $\mathrm{LM}(a_i g_i)$
involves at least one of $X_1, \dots, X_k$, and the definition of monomial ordering of
k-elimination type implies that $\mathrm{LM}(a_i g_i) > \mathrm{LM}(f)$. It follows that $a_i = 0$ for
$i > t$, and thus $J = (g_1, \dots, g_t)$.

To see that $\{g_1, \dots, g_t\}$ is a Gröbner basis of J, we apply Theorem 8.23. We
are to show for each pair $(g_j, g_k)$ with $S(g_j, g_k) \ne 0$ and $\{j, k\} \subseteq \{1, \dots, t\}$ that
there is an expansion $S(g_j, g_k) = \sum_{i=1}^{t} a_i g_i$ with $\mathrm{LM}(a_i g_i) \le \mathrm{LM}(S(g_j, g_k))$. In
view of the argument with f in the previous paragraph, it is enough to show that
$S(g_j, g_k)$ lies in J. The formula is

\[
S(g_j, g_k) = \frac{\mathrm{LCM}(\mathrm{LM}(g_j), \mathrm{LM}(g_k))}{\mathrm{LT}(g_j)}\, g_j - \frac{\mathrm{LCM}(\mathrm{LM}(g_j), \mathrm{LM}(g_k))}{\mathrm{LT}(g_k)}\, g_k.
\]

The coefficient fractions are members of $K[X_{k+1}, \dots, X_n]$, since the monomial
ordering is of k-elimination type, and thus $S(g_j, g_k)$ is indeed in J.
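
The last step of the proof can be illustrated on a small case. The sketch below is an assumption-laden illustration, not the text's construction: polynomials in three variables X, Y, Z are stored as dictionaries from exponent tuples to rational coefficients, the ordering is lexicographic (so leading monomials are found by tuple comparison), and the two sample polynomials and all function names are hypothetical. It computes the S-polynomial from the formula above for two polynomials not involving X and confirms that the result does not involve X either.

```python
from fractions import Fraction

def leading_monomial(poly):
    """Largest exponent tuple under the lexicographic ordering."""
    return max(poly)

def times_term(poly, monomial, coefficient):
    """Multiply poly by coefficient * X^monomial."""
    return {tuple(e + m for e, m in zip(exp, monomial)): c * coefficient
            for exp, c in poly.items()}

def subtract(p, q):
    """Return p - q, dropping terms whose coefficient becomes zero."""
    result = dict(p)
    for exp, c in q.items():
        result[exp] = result.get(exp, 0) - c
        if result[exp] == 0:
            del result[exp]
    return result

def s_polynomial(g1, g2):
    """S(g1, g2) = (LCM / LT(g1)) g1 - (LCM / LT(g2)) g2."""
    lm1, lm2 = leading_monomial(g1), leading_monomial(g2)
    lcm = tuple(max(a, b) for a, b in zip(lm1, lm2))
    quotient1 = tuple(l - a for l, a in zip(lcm, lm1))
    quotient2 = tuple(l - a for l, a in zip(lcm, lm2))
    return subtract(times_term(g1, quotient1, Fraction(1) / g1[lm1]),
                    times_term(g2, quotient2, Fraction(1) / g2[lm2]))

# Exponent tuples are (power of X, power of Y, power of Z).
g1 = {(0, 2, 0): 1, (0, 0, 1): -1}   # Y^2 - Z
g2 = {(0, 1, 1): 1, (0, 0, 0): -1}   # Y Z - 1
s = s_polynomial(g1, g2)             # Z*g1 - Y*g2 = Y - Z^2
involves_x = any(exp[0] > 0 for exp in s)
print(s, involves_x)
```

Because neither leading term involves X, the two multipliers are monomials in Y and Z alone, which is the observation the proof relies on.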


EXAMPLE. Formula for discriminant of a polynomial in one variable. This
example is one that we have addressed before by specialized methods. We include
it anyway because the use of Gröbner bases allows one to solve many similar
problems that the specialized methods do not address. By way of illustration,
let $(X - r)(X - s)(X - t)$ be a cubic polynomial. The discriminant is
$D = (r - s)^2 (s - t)^2 (r - t)^2$. This is a polynomial that is symmetric in r, s, t, and the
general theory of symmetric polynomials (in the problems for Chapter VII in
Basic Algebra) shows that it has to be a polynomial in the elementary symmetric
polynomials $a = r + s + t$, $b = rs + rt + st$, $c = rst$. We seek a formula for D
in terms of a, b, c. We form the ideal I in $K[r, s, t, D, a, b, c]$ given by

\[
I = \bigl(D - (r - s)^2 (s - t)^2 (r - t)^2,\; a - (r + s + t),\; b - (rs + rt + st),\; c - rst\bigr).
\]

With the variables enumerated as r, s, t, D, a, b, c, we use any monomial ordering
of 3-elimination type, the lexicographic ordering for example, and form the
reduced Gröbner basis of I. Calculation best done with the aid of a computer
gives $D - a^2b^2 + 4b^3 + 4a^3c - 18abc + 27c^2$ and three other members of I that
involve r, s, or t. Theorem 8.30 shows that the $3^{\mathrm{rd}}$ elimination ideal $I \cap K[D, a, b, c]$ is principal
with generator $D - a^2b^2 + 4b^3 + 4a^3c - 18abc + 27c^2$. Thus the desired formula
is $D = a^2b^2 - 4b^3 - 4a^3c + 18abc - 27c^2$.
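
The resulting formula is easy to sanity-check numerically without any Gröbner basis software. The following Python sketch, whose function names and test range are illustrative assumptions, compares the product form of D with the polynomial in a, b, c over a grid of integer roots. Agreement on the grid is a check of the formula, not a derivation of it.

```python
from itertools import product

def discriminant_from_roots(r, s, t):
    """D = (r - s)^2 (s - t)^2 (r - t)^2."""
    return (r - s) ** 2 * (s - t) ** 2 * (r - t) ** 2

def discriminant_from_symmetric(a, b, c):
    """D = a^2 b^2 - 4 b^3 - 4 a^3 c + 18 a b c - 27 c^2."""
    return a * a * b * b - 4 * b ** 3 - 4 * a ** 3 * c + 18 * a * b * c - 27 * c * c

def formulas_agree(bound=4):
    """Compare both expressions for all integer roots in [-bound, bound]."""
    for r, s, t in product(range(-bound, bound + 1), repeat=3):
        a, b, c = r + s + t, r * s + r * t + s * t, r * s * t
        if discriminant_from_roots(r, s, t) != discriminant_from_symmetric(a, b, c):
            return False
    return True

print(formulas_agree())
```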


Let us come to the Extension Theorem. The statement and proof of this theorem
do not make use of Gröbner bases, but they do refer to the $k^{\mathrm{th}}$ elimination ideal,
which is identified explicitly in Theorem 8.30 with the aid of a Gröbner basis.
The intention is that the theorem be applied inductively in any application, taking
into account one additional variable at each step of an induction.


Theorem 8.31 (Extension Theorem). Let K be an algebraically closed field, let
$I = (f_1, \dots, f_s)$ be an ideal in $K[X_1, \dots, X_n]$, and let J be the first elimination
ideal of I in $K[X_2, \dots, X_n]$. For each $f_i$, expand $f_i$ in powers of $X_1$ as

\[
f_i(X_1, \dots, X_n) = g_i(X_2, \dots, X_n)\, X_1^{l_i} + (\text{lower powers of } X_1)
\]

with $g_i$ in $K[X_2, \dots, X_n]$ and $g_i$ nonzero unless $f_i = 0$. Suppose that $(c_2, \dots, c_n)$
lies in the zero locus $V_K(J) \subseteq K^{n-1}$. If $g_i(c_2, \dots, c_n) \ne 0$ for some i, then there
exists $c_1$ in K such that $(c_1, \dots, c_n)$ is in the zero locus $V_K(I) \subseteq K^n$.
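
A small standard example shows why the hypothesis on the leading coefficients $g_i$ is needed. The choice of example is ours and is not taken from the text: take $I = (XY - 1, XZ - 1)$ in $K[X, Y, Z]$ with X eliminated first, so that $g_1 = Y$ and $g_2 = Z$, and the first elimination ideal is $(Y - Z)$. The Python sketch below, with hypothetical function names and with rational numbers standing in for K, extends a partial solution (y, z) exactly when it lies on the line y = z and the leading coefficients do not both vanish there.

```python
from fractions import Fraction

def f1(x, y, z):
    return x * y - 1

def f2(x, y, z):
    return x * z - 1

def extend_partial_solution(y, z):
    """Return x with f1 = f2 = 0 at (x, y, z), or None if no such x exists."""
    if y != z:
        return None          # (y, z) is not in the zero locus of (Y - Z)
    if y == 0:
        return None          # g_1 = Y and g_2 = Z both vanish; f1 = -1 for every x
    x = Fraction(1) / y      # forced by X*Y = 1
    assert f1(x, y, z) == 0 and f2(x, y, z) == 0
    return x

print(extend_partial_solution(Fraction(2), Fraction(2)))   # extends
print(extend_partial_solution(Fraction(0), Fraction(0)))   # does not extend
```

The partial solution (0, 0) lies in the zero locus of the elimination ideal but fails to extend, which is consistent with the theorem because both leading coefficients vanish there.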




Before giving the proof, we need to extend the theory of the resultant slightly
in such a way that it applies to s polynomials $f_1, \dots, f_s$ rather than just to two.
To do so, we introduce new indeterminates $U_2, \dots, U_s$ and regard

\[
F = U_2 f_2 + \cdots + U_s f_s
\]

as a member of $K[U_2, \dots, U_s, X_1, \dots, X_n]$ whose degree $\deg_1 F$ in $X_1$ is the
maximum of the degrees of $f_2, \dots, f_s$ in $X_1$. We can then view $f_1$ as a member of
the same polynomial ring $K[U_2, \dots, U_s, X_1, \dots, X_n]$ of degree $\deg_1 f_1$ and form
the resultant of $f_1$ and F in the $X_1$ variable. This is computed as the determinant
of some square matrix of size $\deg_1 f_1 + \deg_1 F$, and we are interested only in
the case that $\deg_1 f_1 \ge 1$ and $\deg_1 F \ge 1$. When expanded in monomials
$U^{\alpha} = U_2^{\alpha_2} \cdots U_s^{\alpha_s}$, the determinant is of the form

\[
R(f_1, F) = \sum_{\alpha} h_{\alpha}(X_2, \dots, X_n)\, U^{\alpha}
\]

with each $h_{\alpha}$ in $K[X_2, \dots, X_n]$. The polynomials $h_{\alpha}$ will be called the generalized
resultants in the $X_1$ variable of the ordered pair $(f_1, \{f_2, \dots, f_s\})$.
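
The role of the auxiliary indeterminates can be seen numerically in one variable. The sketch below is an illustration under stated assumptions rather than the construction of the text: it takes s = 3, uses polynomials in a single variable X with rational coefficients, substitutes numerical values for $U_2$ and $U_3$ instead of keeping them as indeterminates, and computes the resultant as the determinant of a Sylvester matrix built from formal degrees. All function names and sample polynomials are hypothetical. When $f_1, f_2, f_3$ share a root, the resultant of $f_1$ and $U_2 f_2 + U_3 f_3$ vanishes for every sampled choice of the U's.

```python
from fractions import Fraction

def sylvester_matrix(f, g):
    """Sylvester matrix of f and g (coefficient lists, highest power first).

    The formal degrees are len(f) - 1 and len(g) - 1, even if a leading
    entry happens to be 0.
    """
    m, n = len(f) - 1, len(g) - 1
    rows = [[0] * i + list(f) + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + list(g) + [0] * (m - 1 - i) for i in range(m)]
    return rows

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size, det = len(a), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for k in range(col, size):
                a[r][k] -= factor * a[col][k]
    return det

def resultant(f, g):
    return determinant(sylvester_matrix(f, g))

def combine(u2, u3, f2, f3):
    """Coefficients of u2*f2 + u3*f3 (f2, f3 of the same formal degree)."""
    return [u2 * p + u3 * q for p, q in zip(f2, f3)]

f1 = [1, -3, 2]         # (X - 1)(X - 2)
f2 = [1, 2, -3]         # (X - 1)(X + 3)
f3 = [1, -6, 5]         # (X - 1)(X - 5), shares the root 1 with f1 and f2
f3_other = [1, -9, 20]  # (X - 4)(X - 5), shares no root with f1

samples = [(1, 1), (2, -1), (3, 5), (1, -1)]
common_root = all(resultant(f1, combine(u2, u3, f2, f3)) == 0
                  for u2, u3 in samples)
no_common_root = any(resultant(f1, combine(u2, u3, f2, f3_other)) != 0
                     for u2, u3 in samples)
print(common_root, no_common_root)
```

Sampling values of the U's only suggests that the coefficients $h_{\alpha}$ vanish; the text's argument works with the U's as genuine indeterminates, which is what allows each coefficient to be isolated.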


PROOF OF THEOREM 8.31. Let us abbreviate $\tilde{X} = (X_2, \dots, X_n)$ and
$\tilde{c} = (c_2, \dots, c_n)$; we shall write

\[
(X_1, \tilde{X}) = (X_1, X_2, \dots, X_n) \quad\text{and}\quad (X_1, \tilde{c}) = (X_1, c_2, \dots, c_n).
\]

We seek $c_1 \in K$ with $f_j(c_1, \tilde{c}) = 0$ for all j. The assumption is that $g_i(\tilde{c}) \ne 0$
for some i, and we may as well assume that this i is i = 1. If $\deg_1 f_1 = 0$, then
$f_1$ is in J, and the conditions that $f_1 = 0$ on $V_K(J)$ and that $g_1(\tilde{c}) \ne 0$ contradict
one another; hence $\deg_1 f_1 \ge 1$.

As in the paragraph before the proof, put $F = U_2 f_2 + \cdots + U_s f_s$. If $\deg_1 F = 0$,
then $f_j$ is independent of $X_1$ for all $j \ge 2$, and hence $f_j$ is in J for $j \ge 2$. In this
case it is enough to find $c_1$ with $f_1(c_1, \tilde{c}) = 0$. Since $g_1(\tilde{c}) \ne 0$, $f_1(X_1, \tilde{c})$ is a
one-variable polynomial of degree $l_1 \ge 1$, and it is 0 for some value $c_1$. Thus the
proof is complete if $\deg_1 F = 0$.

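We may therefore assume that $\deg_1 F \ge 1$. Form the resultant in $X_1$ given by

\[
R(f_1, F) = \sum_{\alpha} h_{\alpha}(\tilde{X})\, U^{\alpha},
\]

where the $h_{\alpha}$'s are the generalized resultants mentioned above. The main step is
to prove that each $h_{\alpha}$ lies in the first elimination ideal J. Since $h_{\alpha}$ depends only
on $\tilde{X}$, it is enough to prove that each $h_{\alpha}$ is in I. We have arranged that each of $f_1$
and F has positive degree and has nonzero leading coefficient in $X_1$, and hence
Theorem 8.1 shows that

\[
a f_1 + b F = R(f_1, F)
\]

for some nonzero polynomials a and b in $K[U_2, \dots, U_s, X_1, \tilde{X}]$. Let the monomial
expansions of a and b in terms of the $U^{\alpha}$'s be $a = \sum_{\alpha} a_{\alpha} U^{\alpha}$ and
$b = \sum_{\beta} b_{\beta} U^{\beta}$. Then we have

\[
\Bigl(\sum_{\alpha} a_{\alpha} U^{\alpha}\Bigr) f_1 + \Bigl(\sum_{\beta} b_{\beta} U^{\beta}\Bigr)\Bigl(\sum_{i=2}^{s} U_i f_i\Bigr) = \sum_{\alpha} h_{\alpha} U^{\alpha}. \tag{$*$}
\]

Let $e_i$ be the multi-index that is 1 in the $i^{\mathrm{th}}$ place and 0 elsewhere. This has the
property that $U^{e_i} = U_i$ for $2 \le i \le s$. We can rewrite $(*)$ as

\[
\sum_{\alpha} h_{\alpha} U^{\alpha} = \sum_{\alpha} a_{\alpha} f_1 U^{\alpha} + \sum_{\alpha} \Bigl(\sum_{\substack{(\beta, i) \text{ with} \\ 2 \le i \le s, \\ \beta + e_i = \alpha}} b_{\beta} f_i\Bigr) U^{\alpha}.
\]

Equating the coefficients of $U^{\alpha}$ on both sides gives

\[
h_{\alpha} = a_{\alpha} f_1 + \sum_{\substack{(\beta, i) \text{ with} \\ 2 \le i \le s, \\ \beta + e_i = \alpha}} b_{\beta} f_i
\]

and exhibits $h_{\alpha}$ as in I. Therefore $h_{\alpha}$ is in the elimination ideal J.
Since $\tilde{c}$ lies in $V_K(J)$, $h_{\alpha}(\tilde{c}) = 0$ for all $\alpha$. Consequently

\[
R(f_1, F)(U_2, \dots, U_s, \tilde{c}) = 0.
\]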


Theorem 8.1 shows that $f_1(X_1, \tilde{c})$ and $F(U_2, \dots, U_s, X_1, \tilde{c})$ have a common
factor of positive degree in $X_1$ provided either or both of two specific coefficients
are nonzero. These are the coefficients of $X_1^{\deg_1 f_1}$ in $f_1(X_1, \tilde{c})$ and of $X_1^{\deg_1 F}$ in
$F(U_2, \dots, U_s, X_1, \tilde{c})$. The coefficient of $X_1^{\deg_1 f_1}$ in $f_1(X_1, \tilde{X})$ is $g_1(\tilde{X})$; thus
the coefficient of $X_1^{\deg_1 f_1}$ in $f_1(X_1, \tilde{c})$ is $g_1(\tilde{c})$ and is nonzero by assumption.
Therefore Theorem 8.1 is applicable.

The common factor of $f_1(X_1, \tilde{c})$ and $F(U_2, \dots, U_s, X_1, \tilde{c})$ may be taken to
be prime, and then it has to be a nonzero scalar multiple of $X_1 - c_1$ for some
$c_1 \in K$, since that is the only kind of prime factor that divides $f_1(X_1, \tilde{c})$, K being
algebraically closed. Thus the element $c_1$ of K satisfies

\[
f_1(c_1, \tilde{c}) = 0 \quad\text{and}\quad F(U_2, \dots, U_s, c_1, \tilde{c}) = 0. \tag{$**$}
\]

Writing out F, we have

\[
0 = F(U_2, \dots, U_s, c_1, \tilde{c}) = U_2 f_2(c_1, \tilde{c}) + \cdots + U_s f_s(c_1, \tilde{c}).
\]

This is an identity in $K[U_2, \dots, U_s]$, and each coefficient must be 0 on the right
side. Thus $0 = f_2(c_1, \tilde{c}) = \cdots = f_s(c_1, \tilde{c})$. Since $(**)$ shows that $f_1(c_1, \tilde{c}) = 0$,
this proves the theorem.


11. Problems 


1. How many points are in $\mathbb{P}^n_K$ if K is a finite field with q elements?

2. Resolve Cramer's paradox as formulated in Section 1.

3. (Euler's Theorem) Prove that if $F(X_1, \dots, X_n)$ is any homogeneous polynomial
of degree d, then $\sum_{j=1}^{n} X_j \frac{\partial F}{\partial X_j} = dF$.


4. Let A and B be unique factorization domains, and let $\iota : A \to B$ be a one-one
homomorphism of commutative rings with identity. For each h(X) in A[X], let
$h^{\iota}(X)$ be the member of B[X] obtained by applying the substitution homomorphism
that acts by $\iota$ on the coefficients and fixes X. Using resultants, prove that
if f(X) and g(X) are two members of A[X] such that $f^{\iota}(X)$ and $g^{\iota}(X)$ have a
common factor in B[X] that is not in B, then f and g have a common factor in
A[X] that is not in A.


5. Theorem 8.1 assumes that at least one of the coefficients $f_m$ and $g_n$ is nonzero.
Sometimes this theorem is phrased with the stronger hypothesis that $f_m$ and $g_n$
are both nonzero. By comparing the resultants that are involved, show that all
parts of the theorem with at least one of $f_m$ and $g_n$ nonzero are consequences of
the theorem with both $f_m$ and $g_n$ nonzero.

6. Let K be an algebraically closed field, let f and g be members of $K[X_1, \dots, X_n]$
with f irreducible, and suppose that $g(a_1, \dots, a_n) = 0$ whenever $f(a_1, \dots, a_n) = 0$.
Give two proofs, one using the Nullstellensatz and one using resultants,
that f divides g.

7. Factor the member $Y^3 - 2XY^2 + 2X^2Y - 4X^3$ of $\mathbb{C}[X, Y]_3$ into first-degree
factors.

8. Find the intersections in $\mathbb{P}^2_{\mathbb{C}}$ of the zero loci of the projective plane curves
$F(X, Y, W) = X(Y^2 - XW)^2 - Y^5$ and $G(X, Y, W) = Y^4 + Y^3W - X^2W^2$.

9. Let A be a unique factorization domain, let $B = A[Y_1, \dots, Y_m, Z_1, \dots, Z_n]$, let
F and G be the polynomials in B[X] given by

\[
F(X) = \prod_{i=1}^{m} (X - Y_i) \quad\text{and}\quad G(X) = \prod_{j=1}^{n} (X - Z_j),
\]

and let $R(Y_1, \dots, Y_m, Z_1, \dots, Z_n)$ be the resultant R(F, G) with respect to X.
(a) Show that $R(Y_1, \dots, Y_m, Z_1, \dots, Z_n)$ equals 0 if $Y_i$ is set equal to $Z_j$.
(b) Deduce from (a) that $Y_i - Z_j$ divides $R(Y_1, \dots, Y_m, Z_1, \dots, Z_n)$.
(c) Deduce from (b) that $R(Y_1, \dots, Y_m, Z_1, \dots, Z_n) = c \prod_{i,j} (Y_i - Z_j)$ for
some $c \ne 0$ in A depending on m and n.

10. Let f(X) be in K[X], K being a field, and let f'(X) be the derivative of f(X).
Using the result of the previous problem and the computation at the beginning
of Section V.4, prove that R(f, f') is a nonzero multiple of the discriminant of
f, the multiple depending only on deg f.

11. Let F and G be the homogeneous polynomials given by
$F(X, Y, W) = (X^2 + Y^2)^2 + 3X^2YW - Y^3W$ and $G(X, Y, W) = (X^2 + Y^2)^3 - 4X^2Y^2W^2$.
Calculate $I(P, F \cap G)$ for P = [0, 0, 1].

12. Let G be a nonconstant homogeneous polynomial in $K[X, Y, W]_d$ vanishing at
a point P of $\mathbb{P}^2_K$, let $m = m_P(G)$ be the order of vanishing of G at P, and let
L be a projective line through P. Show from the definitions that L is a tangent
line to G at P in the sense of Section 5 if and only if $I(P, L \cap G) \ge m + 1$ in
the sense of Section 4.

13. Deduce relative to an arbitrary monomial ordering the (nonconstructive) existence
of a Gröbner basis for a nonzero ideal I in $K[X_1, \dots, X_n]$ from the form
of a set of generators of the ideal LT(I).

14. For $1 \le i \le n$, let $w^{(i)}$ be the weight vector $w^{(i)} = (w^{(i)}_1, \dots, w^{(i)}_n)$ in $\mathbb{R}^n$, and
suppose that these vectors are linearly independent. Show that the $w^{(i)}$ define a
monomial ordering as in Example 5 of Section 7 if and only if for each j, the
first i with $w^{(i)}_j \ne 0$ has $w^{(i)}_j > 0$.

15. This problem shows for two variables that every monomial ordering arises from a
system of two independent weight vectors satisfying the condition in the previous
problem. Let a monomial ordering be imposed on K[X, Y].
(a) If $X > Y^q$ for all $q > 0$, show that the ordering is lexicographic and is
determined by the system of two weight vectors {(1, 0), (0, 1)}.
(b) If $X < Y^q$ for some $q > 0$, show that there exists a unique real number
$r \ge 0$ such that for all ordered pairs of integers $u \ge 0$ and $v \ge 0$, $X^u > Y^v$
if $ru > v$ and $X^u < Y^v$ if $ru < v$.
(c) If $X < Y^q$ for some $q > 0$ and if r is defined as in (b), prove that the
monomial ordering is determined by the system of two weight vectors
{(r, 1), (s, t)} for a suitable (s, t).

16. In K[X, Y], define $f(X, Y) = X^2Y + XY^2 + Y^2$, $f_1(X, Y) = XY - 1$, and
$f_2(X, Y) = Y^2 - 1$. Show that

\[
f(X, Y) = (X + Y) f_1 + 1 \cdot f_2 + r_1 = X f_1 + (X + 1) f_2 + r_2
\]

with $r_1(X, Y) = X + Y + 1$ and $r_2 = 2X + 1$ gives two decompositions in the
lexicographic ordering of f relative to $\{f_1, f_2\}$ satisfying the conditions of the
generalized division algorithm of Proposition 8.20. Conclude that the remainder
term need not be unique, nor need the coefficients of $f_1$ and $f_2$.


17. Observe for any scalar c that the ideal $I = (X^2 + cXY, XY)$ in K[X, Y] is
independent of c.
(a) Verify that $\{X^2 + cXY, XY\}$ is a minimal Gröbner basis of I relative to the
lexicographic ordering for any choice of c.
(b) Show that $\{X^2, XY\}$ is the reduced Gröbner basis for I.


Problems 18–20 characterize ideals in $K[X_1, \dots, X_n]$ whose locus of common zeros
is a finite set under the assumption that K is an algebraically closed field. Thus let
K be an algebraically closed field, and let I be a nonzero ideal in $K[X_1, \dots, X_n]$.

18. Under the assumption for each j with $1 \le j \le n$ that I contains a nonconstant
polynomial $P_j(X_j)$, prove that $V_K(I)$ is a finite set.

19. Conversely under the assumption that $V_K(I)$ is a finite set, use the Nullstellensatz
to produce for each j, a nonconstant polynomial $P_j(X_j)$ lying in I.

20. Impose the usual lexicographic ordering on monomials. Prove that LT(I) contains
some $X_j^{l_j}$ for each j with $1 \le j \le n$ if and only if $V_K(I)$ is a finite
set. (Educational note: The advantage of this characterization over the one in
Problems 18–19 is that checking this one is easy by inspection once a Gröbner
basis of I has been computed.)


Problems 21–23 relate solutions of simultaneous systems of polynomial equations to
the theory of the Brauer group in Chapter III. A field L is said to satisfy condition
(C1) if every homogeneous polynomial of degree d in n variables with d < n has a
nontrivial zero. The significance of this condition was shown in Problem 20 at the
end of Chapter III: the Brauer group B(L) of such a field is necessarily 0. The present
set of problems establishes that a simple transcendental extension of an algebraically
closed field satisfies condition (C1). No knowledge of Chapter III is needed for these
problems, but Problem 23 will take for granted a certain theorem to be proved in
Chapter X.

21. Let K be an algebraically closed field, and let L = K(X) be a simple transcendental
extension. It is to be shown that any member $F(T_1, \dots, T_n)$ of $L[T_1, \dots, T_n]_d$
of the form $F(T_1, \dots, T_n) = \sum_{i_1 + \cdots + i_n = d} a_{i_1, \dots, i_n} T_1^{i_1} \cdots T_n^{i_n}$ has a nontrivial zero if
d < n and each $a_{i_1, \dots, i_n}$ lies in the field L = K(X).
(a) Why is it enough to consider such polynomials with each $a_{i_1, \dots, i_n}$ in the
polynomial ring K[X]?
(b) With the simplification from (a) in place, let $\delta$ be the maximum degree in
X of the coefficients $a_{i_1, \dots, i_n}$. Let N be a positive integer to be specified. By
looking for a solution of the form $T_i = \sum_{j=0}^{N} b_{ij} X^j$ with each $b_{ij}$ in K, show
that substitution of this formula into the formula $F(T_1, \dots, T_n) = 0$ leads
to a system of homogeneous polynomial equations over K in the unknowns
$b_{ij}$, one of each degree from 0 to $\delta + Nd$.

22. (a) In the setting of the previous problem, show that the number of unknowns
is (N + 1)n and that the number of equations is at most $Nd + \delta + 1$.
(b) Show for N sufficiently large that the number of equations is less than the
number of unknowns.

23. The following theorem will be discussed in Chapter X: if K is algebraically
closed and if $m \le n$, then the locus of common zeros in $\mathbb{P}^n_K$ of m nonconstant
homogeneous polynomials in $K[X_1, \dots, X_{n+1}]$ is nonempty. Assuming this
theorem, deduce from the previous two problems the conclusion that the field
L = K(X) satisfies condition (C1) if K is algebraically closed.


CHAPTER IX 


The Number Theory of Algebraic Curves 


Abstract. This chapter investigates algebraic curves from the point of view of their function fields, 
using methods analogous to those used in studying algebraic number fields. 

Section 1 gives an overview, explaining how Riemann’s theory of Riemann surfaces of functions 
ties in with the notion of an algebraic curve and explaining how such curves can be investigated 
through the discrete valuations of their function fields. It is shown that what needs to be studied is 
arbitrary function fields in one variable over a base field. It is known that every compact Riemann 
surface can be viewed as an algebraic curve irreducible over C, and thus the function fields of 
compact Riemann surfaces are to be viewed as informative examples of the theory in the chapter. 

Section 2 introduces the notion of a divisor, which is any formal finite Z linear combination of 
the discrete valuations of the function field that are trivial on the base field, and the notion of the 
degree of a divisor, which is the sum of its coefficients weighted suitably. Each nonzero member 
x of the function field gives rise to a principal divisor (x), and the main result of the section is that 
the degree of every principal divisor is 0. This is an analog for function fields of the Artin product 
formula for number fields. 

Section 3 contains the definition of the genus of the function field under study. The main object
of study is the vector space L(A) for a divisor A; this consists of 0 and all nonzero members x of
the function field such that (x) + A is a divisor ≥ 0. Roughly speaking, it may be viewed as the
space of functions on the zero locus of the curve whose poles are limited to finitely many points and
to a certain order depending on the point. The genus is defined in terms of dim L(A) − deg A when
A is a divisor that is a large multiple of the pole part of any fixed principal divisor. The main result
of the section is Riemann's inequality, which says that dim L(A) ≥ deg A + 1 − g for all divisors
A, where g ≥ 0 is the genus, and that g is the smallest integer that works in this inequality for all
divisors A.

Sections 4–5 concern the Riemann–Roch Theorem, which gives an interpretation of the difference
of the two sides of Riemann’s inequality as dim L(B) for a suitable divisor B that can be defined in 
terms of A. Section 4 gives the statement and proof of the theorem, and Section 5 gives a number 
of simple applications. 


1. Historical Origins and Overview 


As was mentioned in Chapter VII, modern algebraic geometry grew out of early 
attempts to solve simultaneous polynomial equations in several variables and out 
of the theory of Riemann surfaces. Chapter VIII discussed the impact of the first 
of these sources, and the present chapter discusses the impact of the second. 




The theory of Riemann surfaces was begun by Riemann and continued by 
Liouville, Abel, Jacobi, Weierstrass, and others. This section discusses briefly 
the point of view in these studies, which began as an effort to solve a problem in 
real analysis, moved into complex analysis, and finally arrived at investigations of 
affine plane curves over C, but from a point of view quite different from the one in 
Chapter VIII. The end result is a study of the curve through the functions on its zero 
locus, and the approach has something in common with the approach to algebraic 
number theory in Chapter VI. It is not necessary to understand the background in 
maximum generality, and we shall be content with suitable examples. 

Riemann was interested in saying something useful about seemingly intractable
integrals like the one arising from the arc length of an ellipse; let us take

\[
y = y(x) = \int^{x} \frac{dt}{\sqrt{(t - a)(t - b)(t - c)}},
\]

where a, b, c are distinct constants, as a specific example. The lower limit of
integration is unimportant, since it affects the value of the integral only by an
additive constant. We sketch an analysis of the integral,¹ proceeding formally for
the moment. Although y as a function of x seems intractable, any sort of inverse
function has nice properties. The formula for y gives us

\[
dy = \frac{dx}{\sqrt{(x - a)(x - b)(x - c)}},
\]

and an inverse function x = x(y) thus has derivative

\[
\frac{dx}{dy} = \sqrt{(x - a)(x - b)(x - c)}.
\]

Consequently we should expect that

\[
\Bigl(\frac{dx}{dy}\Bigr)^{2} = (x - a)(x - b)(x - c).
\]

Of course, the singularities at a, b, c are problematic, and the square root might
have a negative argument, depending on the location of x.

Riemann's starting point for a rigorous investigation was to let x be complex,
rather than real, and to let the integral be taken over paths in $\mathbb{C}$. The result is
then not an ordinary function y(x), since the square root in the integrand is not
a well-defined function for t in $\mathbb{C} - \{a, b, c\}$. We can make a choice for which
the square root is well defined, however, as long as we restrict attention to a
small neighborhood of a particular t. Thus we can visualize small overlapping
disks each centered at a point along an arbitrary path of integration with t in
$\mathbb{C} - \{a, b, c\}$ with the property that the integrand is well defined on each such
disk. The interpretation of the square root may be assumed to match on the
intersection of any two disks. When a path goes around one or more of the
singularities and we return to the same t, we view the new disk as the same as
the old one if the values of the square root match, but as different if the values do
not match. The union of the disks with this convention becomes a new domain
of interest, and the function $F(t) = \sqrt{(t - a)(t - b)(t - c)}$ on $\mathbb{C} - \{a, b, c\}$
becomes a well-defined function $F(\zeta)$ on this new domain. This new domain is a
relatively simple example of a Riemann surface, i.e., a connected 1-dimensional
complex manifold.

¹For more details one can consult the author's book Elliptic Curves, pp. 165–183.


In more modern language the new domain is a twofold covering of the three-times
punctured plane $\mathbb{C} - \{a, b, c\}$, obtained as follows. We fix a base point $z_0$
in $\mathbb{C} - \{a, b, c\}$ and define a winding number for each of the points a, b, c as
usual. The subset of the fundamental group of $\mathbb{C} - \{a, b, c\}$ for which the sum of
the three winding numbers is even is a subgroup and corresponds, via standard
covering-space theory, to a certain twofold covering space $\mathcal{R}$ of $\mathbb{C} - \{a, b, c\}$,
the covering map being called e. This covering space is a new domain on which
the integrand is well defined. On each fiber of the covering, e is two-to-one. Let
$\zeta_0$ be one of the two preimages of $z_0$. Let us adjoin points $a^*, b^*, c^*, \infty^*$ to the
covering space $\mathcal{R}$ and extend e by the definitions $e(a^*) = a$, $e(b^*) = b$, $e(c^*) = c$,
$e(\infty^*) = \infty$. One can show that the complex structure extends from $\mathcal{R}$ to the
enlarged space $\mathcal{R}^*$ in such a way that the extended e is a holomorphic function
from $\mathcal{R}^*$ onto $\mathbb{C} \cup \{\infty\}$. The enlarged space $\mathcal{R}^*$ becomes a compact Riemann
surface, and the extended e is a branched covering of the Riemann sphere $\mathbb{C} \cup \{\infty\}$.
Topologically $\mathcal{R}^*$ turns out to be a torus, as we shall see in a moment.


Riemann in his own investigations went on to study the function theory of 
compact Riemann surfaces. The interest is in deciding whether there is a globally 
defined meromorphic function with poles/zeros only at chosen points and with 
poles/zeros at most/least of some specified order. If there is such a function, 
one wants to know the dimension of the space of such functions. The basic tool 
for addressing this question is the Riemann–Roch Theorem. In the context of
Riemann surfaces, the Riemann–Roch Theorem has both an analysis aspect and an
algebraic aspect. The analysis aspect may be viewed as using the theory of elliptic 
differential operators to prove existence of enough nonconstant meromorphic 
functions for the Riemann surface to acquire an algebraic structure. For the 
purposes of this book, we can just accept this circumstance and not try to extend 
it in any way; however, we will sketch in a moment how the algebraic structure 
can be obtained concretely for our example. The algebraic aspect may be viewed 
as mining this algebraic structure to deduce as many dimensionality relations 
as possible among the function spaces of interest. This is the theory that we 
shall want to extend; we return to our method for carrying out this project after 
producing the algebraic structure for our example by elementary means. 




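To introduce the algebraic structure in our example, we use our knowledge of
$\mathcal{R}^*$ to make sense out of the expression

\[
w(C) = \int_{C} F(\zeta)^{-1}\, d\zeta
\]

for any piecewise smooth curve C on $\mathcal{R}^*$ that starts from the base point $\zeta_0$.
If C is given by C(t) for t in an interval I, then this integral is to be equal to
$w(C) = \int_{I} F(C(t))^{-1} (e \circ C)'(t)\, dt$. Let $\Gamma_a, \Gamma_b, \Gamma_c$ be small loops in $\mathbb{C} - \{a, b, c\}$
respectively about a, b, c based at $z_0$, each having winding number 1, and define
$\Gamma_1 = \Gamma_a \Gamma_b$ and $\Gamma_2 = \Gamma_b \Gamma_c$. Lift $\Gamma_1$ and $\Gamma_2$ to curves $\tilde{\Gamma}_1$ and $\tilde{\Gamma}_2$ in $\mathcal{R}^*$ based at
$\zeta_0$, and define

\[
\omega_1 = \int_{\tilde{\Gamma}_1} F(\zeta)^{-1}\, d\zeta \quad\text{and}\quad \omega_2 = \int_{\tilde{\Gamma}_2} F(\zeta)^{-1}\, d\zeta.
\]

It turns out that $\Lambda = \mathbb{Z}\omega_1 + \mathbb{Z}\omega_2$ is a lattice in $\mathbb{C}$ and that there is a well-defined
function $w : \mathcal{R}^* \to \mathbb{C}/\Lambda$ such that whenever $\zeta$ is in $\mathcal{R}^*$ and C is a piecewise
smooth curve from $\zeta_0$ to $\zeta$, then $w(\zeta) = w(C) \bmod \Lambda$. The function $w(\zeta)$ is one-one
onto and is biholomorphic. In particular, $\mathcal{R}^*$ is exhibited as homeomorphic
to a torus.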

Let $w^{-1} : \mathbb{C}/\Lambda \to \mathcal{R}^*$ be the inverse function of w, and let $\varphi : \mathbb{C} \to \mathbb{C}/\Lambda$ be
the quotient map. Then the composition $P = e \circ w^{-1} \circ \varphi$ carries $\mathbb{C}$ to $\mathbb{C} \cup \{\infty\}$ and
can be seen to satisfy $P'^2 = (P - a)(P - b)(P - c)$. In other words, P has been
constructed rigorously as an inverse function to the original integral. Except for
small details, P is the Weierstrass $\wp$ function for the lattice $\Lambda$ in $\mathbb{C}$. It is almost
true that $z \mapsto (P(z), P'(z))$ is a parametrization of the zero locus of the affine
plane curve $y^2 - (x - a)(x - b)(x - c)$ defined over $\mathbb{C}$. The sense in which this
parametrization fails is that P(z) takes on the value $\infty$ at certain points. What
happens more precisely is that $z \mapsto [P(z), P'(z), 1]$ is a parametrization of the
zero locus of the projective plane curve $Y^2W - (X - aW)(X - bW)(X - cW)$.


Our initial focus in this chapter is in mining this kind of algebraic-curve 
structure over C to deduce as many dimensionality relations as possible among 
interesting finite-dimensional subspaces of scalar-valued functions on the zero 
locus of the curve. For instance in the example above, one can ask for the 
dimension of the space of meromorphic functions on $\mathcal{R}^*$ with at worst simple
poles at two specified points and with no other poles. The main theorem of this
chapter, the Riemann–Roch Theorem, gives quantitative information about the
dimension of this space and of similar spaces. The goal for this introduction is 
to frame this question as an algebra question about the algebraic structure and 
to see that some basic tools introduced in Chapter VI in the context of algebraic 
number theory are the appropriate tools to use here. 




The primary object of study is the “function field” of the curve in question. 
Let us construct this function field for our example. The ideal 


\[
I = \bigl(Y^2 - (X - a)(X - b)(X - c)\bigr)
\]

in C[X, Y] is prime, and the restrictions of all polynomial functions to its zero
locus V(I) may be identified with the integral domain R = C[X, Y]/I by the
Nullstellensatz. It takes a little argument, which we omit, to justify saying that the
meromorphic functions on the zero locus may be viewed as the field of fractions
F of C[X, Y]/I; suffice it to say for the moment that we insist that the behavior at
all points of the locus, including any points on the line at infinity in the projective
plane, be limited to poles and zeros, and that is why nonrational functions of
(X, Y) do not appear. At any rate, F is what is taken as the function field of the
curve. To have obtained a field by this construction, we could have started with
any affine plane curve f(X, Y) over C as in Chapter VIII, except that the principal
ideal (f(X, Y)) in C[X, Y] has to be assumed to be prime to yield an integral
domain as quotient. That is, f(X, Y) has to be an irreducible polynomial; we say
that the affine plane curve f(X, Y) has to be assumed to be irreducible over C.

The study of members of the function field F from the point of view of their 
poles and zeros is analogous to the problem of studying factorizations in the 
number-theoretic setting. This point was already made in Section VIII.7 of Basic 
Algebra, where the case of the affine plane curve above in which (a, b, c) =
(0, +1, −1) was studied in detail. For this one choice of (a, b, c), the integral
domain R = C[X, Y]/I was observed to be the integral closure of C[X] in a
finite separable extension of C(X), and it is a Dedekind domain by Theorem 8.54 
of Basic Algebra; in fact, the same argument works for any choice of (a, b, c) as 
long as a, b, c are distinct complex numbers. 

Unique factorization of elements into prime elements fails in this R, but we 
saw that a geometrically meaningful factorization instead is the factorization of 
nonzero ideals into prime ideals. This latter factorization is unique because R is a 
Dedekind domain. Meanwhile, since nonzero prime ideals are maximal in R, the 
Nullstellensatz shows² that the nonzero prime ideals in R correspond exactly
to the points of the zero locus V(I). Consequently the unique factorization of
nonzero ideals in R has the geometric interpretation of associating orders of zeros 
and poles to members of R. This all seems very tidy, but there are at least three 
awkward matters that we need to take into account: 


²Let $\varphi : \mathbb{C}[X, Y] \to R$ be the quotient homomorphism. If M is a maximal ideal in R, then
$\varphi^{-1}(M)$ is a maximal ideal in $\mathbb{C}[X, Y]$ and hence is the set of all polynomials vanishing at some
$(x_0, y_0)$. To show that $(x_0, y_0)$ is in V(I), assume the contrary. Then there exists $g \in I$ with
$g(x_0, y_0) \ne 0$. This g is not in the maximal ideal $\varphi^{-1}(M)$, and thus there exist $f \in \varphi^{-1}(M)$ and
$h \in \mathbb{C}[X, Y]$ with f + gh = 1. Applying $\varphi$, we obtain $\varphi(f) = 1$, in contradiction to the fact that
$\varphi(f)$ lies in the proper ideal M of R.




(i) we have not included information about zeros and poles at the points at 
infinity when the curve is viewed projectively, and that information surely 
plays some role, 

(ii) the analysis of the function field F seems to rely on a subfield C(X) for 
which there is surely no canonical description, 

(iii) the ring R no longer need be integrally closed if a, b, c are not assumed
distinct, if for example (a, b, c) = (0, 0, 1).

Point (ii) turns out to be an advantage, allowing us to work with the given curve
from multiple perspectives. The “key observation” at the end of this section will
make clear how we can take advantage of (ii).

Point (iii) is quite significant. The trouble with the curve $Y^2 - X^2(X - 1)$ is that
the curve has a singularity at (0, 0) in the sense of Section VII.5. The maximal
ideals of the ring $\mathbb{C}[X, Y]/(Y^2 - X^2(X - 1))$ correspond to points on the zero
locus of the curve; but the ring is not a Dedekind domain, and we have few tools
for working with it. To handle matters properly, we have to form the function
field directly as $F = \mathbb{C}(X)[Y]/(Y^2 - X^2(X - 1))$ and define R to be the integral
closure of $\mathbb{C}[X]$ in F. This ring R is bigger than $\mathbb{C}[X, Y]/(Y^2 - X^2(X - 1))$ and is
a Dedekind domain. Unfortunately its nonzero prime ideals no longer correspond
exactly to points of the zero locus. Example 1 below will illustrate. What happens
is that F readily provides information about the behavior of nonsingular points
of the zero locus but not about singular points. Problems 5–11 at the end of the
chapter address this matter for nonsingular points for affine plane curves more
generally. The tool for making the connection for curves in higher dimension is
Zariski's Theorem (Theorem 7.23), and we shall carry out the details in Chapter X
when we treat the geometry of curves, as opposed to the number theory.

Point (i) is relevant and is easily handled. When we form the function field 
of the curve and take R to be the integral closure of $\mathbb{C}[X]$ in it, we can associate
$\mathbb{C}[X]$ with the polynomials of $\mathbb{C}$ and think of them as embedded in the field $\mathbb{C}(X)$
of rational functions. The rational functions are all meaningful on the Riemann
sphere $\mathbb{C} \cup \{\infty\}$, and we study behavior of rational functions near $\infty$ by writing
them in terms of $X^{-1}$ and regarding $X^{-1}$ as a new variable that is near 0. In
studying our curve, the points in the projective plane that we miss by considering
just the affine curve are the ones that lie over $\infty$ in the Riemann sphere. We
study them by considering the integral closure R’ of $\mathbb{C}[X^{-1}]$ in F. If the curve is
nonsingular at all points lying over $\infty$, then these points correspond to the prime
ideals of R’ whose intersection with $\mathbb{C}[X^{-1}]$ is the prime ideal $X^{-1}\mathbb{C}[X^{-1}]$ of
$\mathbb{C}[X^{-1}]$.


EXAMPLES. 
(1) Affine plane curve $f(X, Y) = Y^2 - X^2(X - 1)$. This polynomial is
irreducible over $\mathbb{C}$ but is singular at (0, 0) in the sense that $\partial f/\partial X$ and $\partial f/\partial Y$ both
vanish there. Let $F = \mathbb{C}(X)[Y]/(f(X, Y))$, and let x and y be the images of
X and Y in F. These elements lie in the ring S = C[X, Y]/(f(X, Y)), whose 
maximal ideals correspond to points on the zero locus by the Nullstellensatz. 
All members of S are of the form a(x) + yb(x), where a and b are arbitrary 
polynomials in one variable. Any proper ideal in S containing x has to be of the 
form $(x, yc_1(x), \dots, yc_n(x))$ for some polynomials $c_1, \dots, c_n$. A little argument
using the fact that $\mathbb{C}[x]$ is a principal ideal domain shows that the ideal is of the
form (x, yc(x)). Using products of x and polynomials, we see that we can discard
all terms of c(x) but the constant term. Hence the ideal is either (x) itself or is
(x, y). The ideal (x) is not prime, since $y \cdot y$ is in it and y is not in it. The ideal
(x, y) is maximal and hence prime. Since $(x, y)^2 = (x^2, xy, y^2) = (x^2, xy)$ is
properly contained in (x), (x) is not the product of prime ideals in S. Thus S is
not a suitable ring for investigating poles and zeros of members of the field F.
By contrast, a little computation shows that the integral closure R of $\mathbb{C}[x]$ in F
is generated as a $\mathbb{C}$ algebra by x and $x^{-1}y$. This is a Dedekind domain, and the
decomposition of the ideal (x) in R as a product of prime ideals can be checked to
be $(x) = (x, x^{-1}y + i)(x, x^{-1}y - i)$. A factor on the right does not consist of all
functions vanishing at some (0, yo) lying on the zero locus. The only point (0, yo) 
on the zero locus is (0, 0), and the two prime factors of (x) say something about 
derivatives at that point. This example will be considered further in Problems 
21-22 at the end of the chapter. 
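
The decomposition of (x) asserted in Example 1 can be checked by a short computation. The following is only a sketch and is not taken from the text; the symbol t is introduced here for illustration and denotes $x^{-1}y$, and the description of R as generated by x and $x^{-1}y$ is the one quoted in the example.

```latex
\begin{align*}
t &= x^{-1}y, \qquad t^2 = x^{-2}y^2 = x^{-2}\cdot x^2(x-1) = x - 1,\\
x &= t^2 + 1 = (t+i)(t-i), \qquad y = xt = t(t^2+1),\\
R &= \mathbb{C}[x,\, x^{-1}y] = \mathbb{C}[t],\\
(x,\, t+i) &= (t+i), \qquad (x,\, t-i) = (t-i),\\
(x) &= (t+i)(t-i) = (x,\, x^{-1}y+i)\,(x,\, x^{-1}y-i).
\end{align*}
```

Under these assumptions both factors are prime because $t \pm i$ are irreducible in the polynomial ring $\mathbb{C}[t]$. Each factor contains both x and $y = t(t^2+1)$, so each meets S in the maximal ideal (x, y); this is consistent with the remark that neither factor is the set of all functions vanishing at a point of the zero locus.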


(2) Affine plane curve $f(X, Y) = Y^2 - X^4 + 1$. This polynomial is irreducible
over $\mathbb{C}$ and is nonsingular at every point of its zero locus in $\mathbb{C}^2$. Again we form the
function field F, the members x and y of it, and the ring $\mathbb{C}[X, Y]/(f(X, Y))$. Using
the fact that $X^4 - 1$ is square free, we can check that this ring is the full integral
closure R of $\mathbb{C}[x]$ in F. The ring R is a Dedekind domain, and its elements are all
expressions a(x) + yb(x), where a(x) and b(x) are polynomials. Moreover,
we have $(y + x^2)(y - x^2) = y^2 - x^4 = (x^4 - 1) - x^4 = -1$. Consequently
the elements $y \pm x^2$ are nonconstant units in R, and they cannot have zeros or
poles on the zero locus of f(X, Y) in $\mathbb{C}^2$. Thus knowledge of the orders of zeros
and poles at every point of the zero locus of f(X, Y) in $\mathbb{C}^2$ does not determine
a member of R up to a constant factor. Instead, we have to take into account
the behavior at any points at infinity on the zero locus in the projective plane $\mathbb{P}^2_{\mathbb{C}}$.
To see what this set is, we convert f(X, Y) into a homogeneous polynomial of
degree 4, specifically into $F(X, Y, W) = Y^2W^2 - X^4 + W^4$, and then we look
for points [x, y, w] with F(x, y, w) = 0 and w = 0. These have x = 0 and thus
come down to [0, y, 0]. In other words, there is only one point at infinity on the
zero locus of the curve. It is singular because all three partial derivatives of F
are 0 there. The fact that it is singular means that we should not expect the
prime ideals lying over $x^{-1}\mathbb{C}[x^{-1}]$ in the integral closure R’ of $\mathbb{C}[x^{-1}]$ in F to
correspond to the points at infinity on the curve. We return to this example shortly.
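
The unit identity $(y + x^2)(y - x^2) = -1$ in Example 2 can also be checked mechanically. The sketch below is an illustration only, not part of the text: it assumes that an element a(x) + yb(x) of the ring is stored as a pair of coefficient lists (lowest degree first, integer coefficients for simplicity), and the helper names `trim`, `padd`, `pmul`, and `rmul` are hypothetical names chosen here.

```python
from itertools import zip_longest


def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def padd(p, q):
    """Add two polynomials given as coefficient lists, lowest degree first."""
    return trim([a + b for a, b in zip_longest(p, q, fillvalue=0)])


def pmul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


# In the ring of Example 2 the relation y^2 = x^4 - 1 holds.
X4_MINUS_1 = [-1, 0, 0, 0, 1]


def rmul(u, v):
    """Multiply a1 + y*b1 by a2 + y*b2, reducing y^2 to x^4 - 1."""
    a1, b1 = u
    a2, b2 = v
    a = padd(pmul(a1, a2), pmul(pmul(b1, b2), X4_MINUS_1))
    b = padd(pmul(a1, b2), pmul(a2, b1))
    return (a, b)


y_plus = ([0, 0, 1], [1])    # the element y + x^2
y_minus = ([0, 0, -1], [1])  # the element y - x^2
product = rmul(y_plus, y_minus)
print(product)
```

The pair that is printed has constant part −1 and no y part, which is the statement that $y + x^2$ and $-(y - x^2)$ are inverse to each other in R.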


All these matters begin to sound quite complicated to sort out, but magically 
there is a simple way of handling them: for an affine plane curve irreducible 
over C, we work with the field F of rational functions for the curve, ignoring 
the geometry of the curve, and we consider all discrete valuations on this field 
that are 0 on $\mathbb{C}^\times$. Discrete valuations were discussed at length in Section VI.2.
They depend only on F, not on the choice of a subring for which F is the field 
of fractions. As will be seen in Chapter X, the full set of discrete valuations of 
F gives information about all potential nonsingular points for any affine curve 
with function field F, not necessarily planar; there will even be such a curve 
whose extension to be defined projectively is everywhere nonsingular, and then 
the points on the zero locus of the curve in projective space will be in one-one 
correspondence with the discrete valuations of F. 

Let us review what Chapter VI tells us about discrete valuations in our setting.
Let f(X, Y) be an irreducible polynomial in $\mathbb{C}[X, Y]$, let F be the field
$\mathbb{C}(X)[Y]/(f(X, Y))$, let x and y be the images of X and Y in F, and let R be the
integral closure of $\mathbb{C}[x]$ in F. This is a Dedekind domain by Theorem 8.54 of
Basic Algebra. Corollary 6.10 classifies the discrete valuations of F that are 0 on
$\mathbb{C}^\times$. It shows that all but finitely many correspond to prime ideals in R. There
are only finitely many others. Corollary 6.10 tells us that these other discrete
valuations can be described in terms of the integral closure R’ of $\mathbb{C}[x^{-1}]$ in F;
this is another Dedekind domain whose field of fractions is F. The exceptional
discrete valuations of F arise from those prime ideals of R’ that occur in the
decomposition of the ideal $x^{-1}R'$ into prime ideals of R’. Geometrically we may
view these additional discrete valuations as associated in some way with points at 
infinity in a projective space, but we can proceed with algebraic manipulations of 
these discrete valuations without invoking the geometric interpretation or using 
projective space. 


EXAMPLE 2, CONTINUED. We continue with the affine plane curve $Y^2 - X^4 + 1$,
the prime ideal $I = (Y^2 - X^4 + 1)$, and the ring R given as the integral closure of
$\mathbb{C}[X]$ in the field $F = \mathbb{C}(X)[Y]/I$. Corollary 6.10 divides the discrete valuations
of F that are 0 on $\mathbb{C}^\times$ into two kinds. The ones of the first kind are built from the
nonzero prime ideals of R. Since $y \pm x^2$ are units in R, all of these valuations
take the value 0 on $y \pm x^2$. The discrete valuations of the second kind are those
appearing in the decomposition of the ideal $x^{-1}R'$ in the integral closure R’ of
$\mathbb{C}[x^{-1}]$ in F. The element $x^{-2}y$ is in R’ because it is a root of the polynomial
$Y^2 - (1 - x^{-4})$ in $\mathbb{C}[x^{-1}][Y]$. Hence R’ contains $x^{-1}$ and $x^{-2}y$. On the other
hand, the most general element of F is of the form $a(x^{-1})x^{-2}y + b(x^{-1})$, where a
and b are rational expressions in one variable, and this is a root of the polynomial

$$Y^2 - 2b(x^{-1})Y + \bigl(b(x^{-1})^2 - a(x^{-1})^2(1 - x^{-4})\bigr).$$


For this element to be in R’, the coefficients must be in $\mathbb{C}[x^{-1}]$. This means that
b(X) is a polynomial and that $a(X)^2(1 - X^4)$ is a polynomial. Since $1 - X^4$ has
no repeated roots, the latter condition forces a(X) to be a polynomial. Thus $x^{-1}$
and $x^{-2}y$ generate R’ as a $\mathbb{C}$ algebra. Define ideals in R’ by

$$P_1 = (x^{-1},\, x^{-2}y + 1) \qquad\text{and}\qquad P_2 = (x^{-1},\, x^{-2}y - 1).$$

Then it is straightforward to check the decompositions

$$(x^{-1}) = P_1P_2, \qquad (x^{-2}y + 1) = P_1^4, \qquad\text{and}\qquad (x^{-2}y - 1) = P_2^4.$$

Since $[F : \mathbb{C}(x^{-1})] = 2$ and since $x^{-1}$ is prime in $\mathbb{C}[x^{-1}]$, the ideal $(x^{-1})$ in R’ is
the product of at most two prime ideals, and it follows that $P_1$ and $P_2$ are prime
ideals in R’. They are distinct because the difference of the respective second
generators is a nonzero scalar. In view of Corollary 6.10, there are exactly two discrete
valuations of F that are 0 on $\mathbb{C}^\times$ other than the ones coming from prime ideals of
R, and these are the ones coming from the prime ideals $P_1$ and $P_2$ of R’. Let us call
them $v_1$ and $v_2$. The above decompositions of principal ideals give
$v_1(y + x^2) = v_1(x^2) + v_1(x^{-2}y + 1) = (-2) + (+4) = +2$, whereas
$v_1(y - x^2) = (-2) + (0) = -2$. Thus $v_1$ takes the distinct values $0$, $+2$, and $-2$ on 1, $y + x^2$,
and $y - x^2$. Similarly $v_2$ takes the values $0$, $-2$, and $+2$ on these elements.


We shall work with those discrete valuations of the field of rational functions 
for the curve under study that are 0 on the base field. These are canonical, 
independent of our choice of some Dedekind domain whose field of fractions is 
the given field. However, making a choice of Dedekind domain is convenient 
for making calculations. Then we can consider the discrete valuations as of two 
kinds, and which discrete valuations are of which kind will depend on our choice 
of Dedekind domain. 


Context for the study in this chapter. Having concluded that the object to 
investigate is the field of rational functions of our curve and that the tools include 
the discrete valuations, we can now consider the context in which we should 
work. Let k be any field, not necessarily algebraically closed. We want to work 
with the “function field” of a suitable kind of curve defined over k. If I is an ideal
in $k[X_1, \dots, X_n]$, then the ring $R = k[X_1, \dots, X_n]/I$ is an integral domain if
and only if the ideal I is prime, and in this case the field of fractions F of R can
be taken to be the associated function field. Thus we restrict attention to the case
that I is prime. To bring in the notion that the curve is to be 1-dimensional, we
recall from Theorem 7.22 that the integral domain R has Krull dimension 1 in
the sense of Section VII.4 if and only if the field of fractions F has transcendence
degree 1 over k. In this case, F is finitely generated as a field over k, with a finite
set of generators consisting of the elements $x_j = X_j + I$ for $1 \le j \le n$. That is,
F is a function field in one variable over k. 


Conversely if F is a function field in one variable over k, then F is a finite 
algebraic extension of a simple transcendental extension $k(x_1)$. Let us write it as
$F = k(x_1)[x_2, \dots, x_n]$ for some n. Form the polynomial ring $k[X_1, \dots, X_n]$ and
the ring homomorphism of this ring into F that fixes k and sends $X_j$ into $x_j$. The
image of this homomorphism is an integral domain R whose field of fractions is
F, and the kernel is a prime ideal I such that $R \cong k[X_1, \dots, X_n]/I$. Theorem
7.22 tells us that R has Krull dimension 1. 

We are led to the following definition. For any field k and any integer $n \ge 1$,
an ideal I in $k[X_1, \dots, X_n]$ is called an affine curve irreducible³ over k if I is
prime and the integral domain $R = k[X_1, \dots, X_n]/I$ has Krull dimension 1. An
affine plane curve (f(X, Y)) in the sense of Chapter VIII will be an object of this
kind if f(X, Y) is an irreducible polynomial.⁴

The geometry of the zero loci of the curves we study will not play a role in the 
mathematics of this chapter; only the field of fractions F and the base field k will. 
We postpone to Chapter X any discussion of the geometry.⁵ For any function
field F in one variable over an arbitrary field k, we shall study in detail those 
discrete valuations of F that are 0 on k. We refer to such discrete valuations as 
the discrete valuations of F defined over k. It will be helpful as motivation to 
remember for the special case in which k is algebraically closed 


• that the members of F may be viewed as all rational functions on the zero
locus of an affine curve irreducible over k,

• that the order-of-a-zero function at any nonsingular point of this zero
locus gives an example of a discrete valuation of F defined over k, and

• that all discrete valuations of F defined over k arise in this way if the
zero locus is nonsingular at every point and we take into account points
at infinity in projective space.


However, the formal development will not make use of these interpretations. 


³Beware of assuming too much irreducibility about such a curve. Just because I is prime does
not mean that I remains prime when we extend the scalars and work with an algebraic closure $k_{\mathrm{alg}}$
of k. For example, $X^2 + Y^2$ is an affine curve irreducible over $\mathbb{R}$, but it factors as $(X + iY)(X - iY)$
over $\mathbb{C}$ and is therefore not irreducible over $\mathbb{C}$.

⁴This change of context for the word “curve” from the definition in Chapter VIII is appropriate
because of a change of emphasis: we shall now be studying an associated function field rather than the 
defining ideal. The word “curve” will undergo a genuine change in meaning in Chapter X: because 
of the Nullstellensatz, classical algebraic geometry in the form to be discussed in much of Chapter X 
places emphasis on zero loci defined by prime ideals of polynomials over an algebraically closed 
field, and it will be convenient to define the curve to be the zero locus rather than the defining ideal. 

⁵In Chapter X we shall introduce two distinct notions of sameness for the zero loci under the
assumption that the field is algebraically closed, namely “isomorphism” and “birational equivalence.” 
The first is a refinement of the second. Birational equivalence will turn out to mean that the function 
fields are isomorphic. An important theorem says that each birational equivalence class of irreducible 
curves contains one and only one isomorphism class of curves that are everywhere nonsingular in 
the sense of Section VII.5. 


What to expect from the study. When k is not necessarily algebraically closed, 
these interpretations break down, at least to some extent. Yet the main theorem of 
the chapter, the Riemann–Roch Theorem, is still geared to the geometric interpretation
of discrete valuations in terms of poles and zeros. One may reasonably ask
why one goes to the trouble of working in such a general context that the theory no 
longer has its geometric interpretation. The answer is that the investigation is to 
be regarded as one in number theory, not in geometry. For example, studying an 
affine plane curve over a field $\mathbb{F}_p$ is the same as studying solutions of congruences
in two variables modulo a prime. Studying such a curve over the p-adic field $\mathbb{Q}_p$
is the same as studying solutions of such congruences modulo arbitrary powers 
of p. The Riemann-Roch Theorem is actually the first serious aid in making this 
study. The present chapter therefore does not constitute such a study; it merely 
prepares one for such a study. In addition, there is a side benefit to understanding 
the number theory that arises this way: the methods and results of this subject 
and of algebraic number theory have enough in common that the methods and 
results for each suggest methods and results for the other. 

An especially tantalizing example of this phenomenon concerns zeta functions.
The zeros with $0 < \operatorname{Re} s < 1$ for the Riemann zeta function, which is the
meromorphic continuation to $\mathbb{C}$ of $\zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \prod_{p\ \mathrm{prime}} (1 - p^{-s})^{-1}$,
influence the error term in the distribution of the primes as asserted by the Prime
Number Theorem. The classical Riemann hypothesis is the statement that the
only such zeros occur on the line $\operatorname{Re} s = \tfrac12$; it implies a high level of control of
this error term. There is a corresponding zeta function for any algebraic number 
field, and to it corresponds a version of the Riemann hypothesis appropriate for 
prime ideals for the number field. Proofs or counterexamples for these versions 
of the Riemann hypothesis have been sought for more than a century. 

Meanwhile, one can formulate a Riemann hypothesis for any function field 
in one variable over any finite field, and again the statement has consequences 
for the distribution of prime ideals. This time, however, the Riemann hypothesis 
is a theorem, stated and proved by A. Weil in 1940. One might hope that the 
methods used for Weil’s theorem could shed enough light on the classical Riemann 
hypothesis to lead to a proof, but to date this has not happened. 


Key observation to be used during the study. In the next section we shall 
make systematic use of the following construction for any function field F in 
one variable over the field k. If x is any element of F transcendental over k, 
then the only discrete valuations of F defined over k that take a nonzero value 
on x may be described as follows. Let R be the integral closure of k[x] in F, 
and let R’ be the integral closure of $k[x^{-1}]$ in F. Then R and R’ are Dedekind
domains by Corollary 7.14, whether or not F is a separable extension of k(x).
Both have F as field of fractions. Let the ideals xR of R and $x^{-1}R'$ of R’ have
prime decompositions $xR = P_1^{e_1} \cdots P_g^{e_g}$ and $x^{-1}R' = Q_1^{e'_1} \cdots Q_{g'}^{e'_{g'}}$. Then the
valuations $v_{P_i}$ for $1 \le i \le g$ and $v_{Q_j}$ for $1 \le j \le g'$ defined by $P_i$ and $Q_j$ have
$v_{P_i}(x) = e_i$ and $v_{Q_j}(x) = -e'_j$, and no other discrete valuation of F that is defined
over k takes a nonzero value on x. This observation follows from Corollary 6.10 
and the definition of the discrete valuation associated with a nonzero prime ideal 
in a Dedekind domain. 


2. Divisors 
Let k be a field, and let F be a function field in one variable over k. The first step
is one of normalization: there is no loss of generality in replacing k by the larger
field k’ of all elements of F that are algebraic over k.⁶


Proposition 9.1. Let F be a function field in one variable over k, and let k’ be 
the subfield of all elements in F algebraic over k. If x is in $F^\times$, then every discrete
valuation of F defined over k vanishes on x if and only if x is in k’. Consequently
F is automatically a function field in one variable over k’, and as such, its discrete 
valuations defined over k’ coincide with its discrete valuations defined over k. 


PROOF. If $x \in F$ is transcendental over k, then the observation at the end
of Section 1 produces discrete valuations of F defined over k that take nonzero
values on x. Conversely if $x \in F^\times$ is algebraic over k, we argue by contradiction.
We may assume that $x \ne 0$. Suppose that v is a discrete valuation of F defined
over k such that $v(x) \ne 0$. Possibly replacing x by $x^{-1}$, we may assume that
$v(x) > 0$. Being nonzero algebraic over k, x satisfies a polynomial equation

$$a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x + a_0 = 0$$

with all $a_j \in k$ and with $a_0 \ne 0$. For each $j \ge 1$ with $a_j \ne 0$, we have
$v(a_j x^j) = v(a_j) + jv(x) = jv(x) > 0$. If $a_j = 0$, then $v(a_j x^j) = \infty > 0$. Thus
$v(a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x) > 0$. Since $v(a_0) = 0$, property (vi) of discrete
valuations in Section VI.2 shows that

$$v\bigl((a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x) + a_0\bigr) = v(a_0) = 0 \ne \infty = v(0),$$

contradiction.

The conclusions in the last sentence of the proposition now follow: Since F
is generated over k by finitely many elements $x_1, \dots, x_n$, it is generated over
k’ by the same elements. Moreover, any element of F transcendental over k is
transcendental over k’, since k’ is algebraic over k. Thus F is a function field in 
one variable over k’. The first paragraph of the proof shows that every discrete 
valuation of F defined over k is defined over k’, and the converse statement is 
immediate from the definition. 


⁶The field k’ is called the field of constants by some authors.


In accordance with Proposition 9.1, there is no loss of generality in replacing 
k by k’ throughout. Changing notation, we assume henceforth that F is a function 
field in one variable defined over k and that every element of F not in k is 
transcendental over k. These hypotheses will not be repeated for each result. 

Suppressing k in the notation, we denote by $V_F$ the set of all discrete valuations
of F defined over k. A divisor is any member of the free abelian group $D_F$ on
$V_F$. Elements of $D_F$ will be written additively,⁷ and thus a typical member of $D_F$
is

$$A = \sum_{v \in V_F} n_v v$$

with only finitely many of the integers $n_v$ nonzero. We write $\operatorname{ord}_v A$ for the integer
$n_v$, calling it the order of A at v. The identity element of $D_F$ is called zero and
is denoted by 0.
Each x in $F^\times$ defines a principal divisor (x) by the formula

$$(x) = \sum_{v \in V_F} v(x)\,v.$$


We verify that (x) is indeed a divisor by showing that v(x) is nonzero for only
finitely many v in $V_F$. For x in k, v(x) = 0 for all v. All other x are transcendental
over k, and the observation at the end of Section 1 shows that exactly g + g’
members of $V_F$ are nonzero on x, where g and g’ are certain positive integers
depending on x.

It is sometimes convenient to decompose (x) as a particular difference of two
divisors, writing $(x) = (x)_0 - (x)_\infty$ with

$$(x)_0 = \sum_{v(x) > 0} v(x)\,v \qquad\text{and}\qquad (x)_\infty = \sum_{v(x) < 0} (-v(x))\,v.$$


This notation is motivated by the interpretation of (x) for the case k = C, which 
is discussed in an example below. 

Because of the formula v(xy) = v(x) + v(y), the set of principal divisors is a
subgroup $P_F$ of $D_F$, and the mapping $x \mapsto (x)$ is a group homomorphism of $F^\times$
onto $P_F$. The quotient $C_F = D_F/P_F$ is called the group of divisor classes of F
over k.


EXAMPLE. k = C. This is the setting of a compact Riemann surface, provided 
we take for granted that every compact Riemann surface can be realized as a 
nonsingular projective curve over C. The field F is the field of global meromorphic 


⁷Some authors use a multiplicative notation.


functions on the surface. A principal divisor can be viewed as a compilation of 
the orders of the zeros and poles of a nonzero global meromorphic function: each 
member of $V_F$ corresponds to a point of the surface, and the order of a principal
divisor (x) with $x \in F^\times$ at a point is positive if the meromorphic function x has
a zero at the point, negative if x has a pole there. It is known that the sum of the
orders of all the zeros of a nonzero global meromorphic function equals the sum
of the orders of all the poles. In the current framework the statement is that
$\sum_{v \in V_F} v(x)$ is 0 for every $x \in F^\times$ when $k = \mathbb{C}$.


Theorem 9.3 will generalize the fact about compact Riemann surfaces that
$\sum_{v \in V_F} v(x) = 0$ for every $x \in F^\times$ when $k = \mathbb{C}$. When $\mathbb{C}$ is replaced by a more
general field that is not necessarily algebraically closed, Proposition 6.9 already 
shows that the terms v(x) in the corresponding sum have to be weighted by certain 
integers in order to yield sum 0. These integers are dimensions that are shown to 
be finite in the next proposition. 


Proposition 9.2. Let v be any discrete valuation of F defined over k, let $R_v$ be
the valuation ring, and let $P_v$ be the valuation ideal. Then $R_v$ and $P_v$ are k vector
spaces, and $\dim_k R_v/P_v$ is finite.


REMARKS. The integer $f_v = \dim_k R_v/P_v$ is called the residue class degree of
the valuation v. The proof gives a method for computing $f_v$, and we shall make
use of this method shortly in proving Theorem 9.3.


PROOF. The fact that $R_v$ and $P_v$ are k vector spaces is immediate from
Proposition 9.1. Since v is not identically zero, there exists some $x \in F$ with
$v(x) \ne 0$, and x is transcendental by Proposition 9.1. Possibly replacing x by
$x^{-1}$, we may assume that v(x) > 0. The observation at the end of Section 1
classifies those members of $V_F$ taking positive values on x. In that notation we
decompose (x)R as $P_1^{e_1} \cdots P_g^{e_g}$, and v is the valuation defined by $P_j$ for some j.
Theorem 6.5e shows that $R_v/P_v \cong R/P_j$. Since x is prime in k[x], the general
theory of extensions of Dedekind domains shows that $P_j \cap k[x] = xk[x]$ and that
$f_j = \dim_{k[x]/(x)}(R/P_j)$ is finite. The field k[x]/(x) is isomorphic to k, and thus
the dimension over k of $R_v/P_v \cong R/P_j$ is $f_j$.


The degree of a divisor A is the integer $\deg A = \sum_{v \in V_F} f_v \operatorname{ord}_v(A)$, where
$f_v$ is the residue class degree of v as defined in the remarks with Proposition
9.2. Degree is a homomorphism of $D_F$ into $\mathbb{Z}$. We shall prove in Theorem
9.3 that principal divisors have degree 0. This result extends Proposition 6.9, 
which handles the special case of the function field k(x). Theorem 9.3 may be 
regarded as a function-field analog of the Artin product formula (Theorem 6.51) 
for number fields, but the proof is much easier for function fields because we can 
take advantage of the observation at the end of Section 1. 


Theorem 9.3. The degree of every principal divisor is 0. In more detail, if (x)
is a principal divisor with x not in k, then $\deg(x)_0 = \deg(x)_\infty = \dim_{k(x)} F$, and
hence $\deg(x) = \deg(x)_0 - \deg(x)_\infty = 0$.


PROOF. If x is in $k^\times$, then Proposition 9.1 shows that v(x) = 0 for every
$v \in V_F$, and hence deg(x) = 0. Thus we may assume that x is transcendental
over k. Applying the observation at the end of Section 1 and using the notation
from there, we know that the only v’s for which $v(x) \ne 0$ are the ones relative to
the prime ideals $P_i$ of R and the prime ideals $Q_j$ of R’ such that

$$xR = P_1^{e_1} \cdots P_g^{e_g} \qquad\text{and}\qquad x^{-1}R' = Q_1^{e'_1} \cdots Q_{g'}^{e'_{g'}}. \qquad (*)$$

Moreover, $v_{P_i}(x) = e_i$ and $v_{Q_j}(x) = -e'_j$. In addition, the proof of Proposition
9.2 showed that the respective residue class degrees are the usual indices $f_i$ and
$f'_j$ associated to the decompositions $(*)$. Thus

$$\deg(x)_0 = \sum_{i=1}^{g} f_i e_i \qquad\text{and}\qquad \deg(x)_\infty = \sum_{j=1}^{g'} f'_j e'_j.$$

Two applications of Theorem 9.60 of Basic Algebra show that

$$\sum_{i=1}^{g} f_i e_i = \dim_{k(x)} F \qquad\text{and}\qquad \sum_{j=1}^{g'} f'_j e'_j = \dim_{k(x^{-1})} F.$$

Thus $\deg(x)_0 = \dim_{k(x)} F$, and $\deg(x)_\infty = \dim_{k(x^{-1})} F$. The theorem therefore
follows from the fact that $k(x) = k(x^{-1})$.
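
As an illustration of Theorem 9.3, the computation in Example 2 of Section 1 can be reread in the language of divisors. The sketch below is not from the text. It assumes the values of $v_1$ and $v_2$ found in that example, writes $z = y + x^2$ (a name introduced here only for illustration), and uses the fact that every residue class degree equals 1 when $k = \mathbb{C}$, because $\mathbb{C}$ has no proper finite extensions.

```latex
\begin{align*}
z &= y + x^2, \qquad z^{-1} = -(y - x^2),\\
(z) &= 2v_1 - 2v_2, \qquad (z)_0 = 2v_1, \qquad (z)_\infty = 2v_2,\\
\deg (z) &= f_{v_1}\cdot 2 - f_{v_2}\cdot 2 = 2 - 2 = 0,\\
y &= \tfrac12\,(z - z^{-1}), \qquad x^2 = \tfrac12\,(z + z^{-1}),\\
\dim_{\mathbb{C}(z)} F &= 2 = \deg (z)_0 = \deg (z)_\infty .
\end{align*}
```

The fourth line shows that $F = \mathbb{C}(z)(x)$ with $x^2$ in $\mathbb{C}(z)$, so the dimension 2 in the last line can be seen directly as well as from the theorem.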


Let $D_{F,0}$ be the subgroup of all divisors of degree 0. Theorem 9.3 shows
that $P_F \subseteq D_{F,0}$. The quotient $C_{F,0} = D_{F,0}/P_F$ is therefore a subgroup of
$C_F = D_F/P_F$ and is the group of all divisor classes of degree 0. This is a
function-field analog of the class group for an algebraic number field; it can be
shown to be finite if k is a finite field, but it need not be finite if k is an arbitrary field.


3. Genus 


In this section, F denotes a function field in one variable over a field k, and we 
assume that every element of F outside k is transcendental over k. We continue 
with the notation $V_F$, $D_F$, $f_v$, $\operatorname{ord}_v A$, $\deg A$, and (x) for $x \in F^\times$, all as in Section 2.

If we were studying only what happens with k = C, we would be interested 
in the vector space of all meromorphic functions whose poles are limited to a 
certain finite set of points and are limited to some particular order at each of those
points. The underlying compact Riemann surface is an ordinary closed orientable
2-dimensional manifold, and the dimensions of these spaces of meromorphic
functions turn out to control the genus of this manifold. For general k, we study
the natural generalization of this situation.⁸ The vector spaces of interest are
defined in terms of divisors, and we will be led to a natural definition of genus of 
the curve under study. 

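We introduce a partial ordering on $D_F$ by saying that two divisors A and B
have $A \le B$ if $\operatorname{ord}_v A \le \operatorname{ord}_v B$ for all $v \in V_F$. The inequality $B \ge A$ is to
mean the same thing as $A \le B$. If $A \le B$ and $A' \le B'$, then $A + A' \le B + B'$
because $\operatorname{ord}_v(A + A') = \operatorname{ord}_v A + \operatorname{ord}_v A' \le \operatorname{ord}_v B + \operatorname{ord}_v B' = \operatorname{ord}_v(B + B')$.
If $A \le B$, then $-A \ge -B$.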

For each divisor A, we shall study the k vector space 


$$L(A) = \{0\} \cup \{x \in F^\times \mid (x) \ge -A\} = \{x \in F \mid v(x) \ge -\operatorname{ord}_v A \text{ for all } v \in V_F\}.$$


For $x \ne 0$, we can think of v(x) as telling the order of the zero of x at a point
corresponding to v. In that spirit, if $A \ge 0$, then L(A) consists of all functions
whose poles are limited to the set of v’s for which $\operatorname{ord}_v A \ne 0$, with the order of the
pole bounded above by the number $\operatorname{ord}_v A$. For general A, a similar interpretation
is valid, except that the members of L(A) are required also to vanish at certain 
points at least to certain orders. 

We shall suppress any name for the function that embeds $V_F$ in $D_F$. Thus
for example if $v_0$ is in $V_F$, then $L(v_0)$ refers to L(A) for the divisor A such that
$\operatorname{ord}_{v_0} A = 1$ and $\operatorname{ord}_v A = 0$ when $v \ne v_0$.


Corollary 9.4. L(0) = k, and L(A) = 0 if A is a nonzero divisor with $A \le 0$.


PROOF. If $A \le 0$ is nontrivial and if $x \in F^\times$ were to have $(x) \ge -A$, then we
would have $\deg(x) \ge -\deg A > 0$, in contradiction to the conclusion deg(x) = 0
of Theorem 9.3. Thus L(A) = 0. Next, we have

$$L(0) = \{x \in F^\times \mid v(x) = 0 \text{ for all } v\} \cup \bigcup_{v \in V_F} L(-v).$$

The first term on the right side is $k^\times$, and the second term gives 0 by what we
have just proved. Hence L(0) = k.


If $A \le B$, then it follows from the definition that $L(A) \subseteq L(B)$. We shall
be interested in how much L(B) increases when B increases. This change is
measured by what happens to the quotient space L(B)/L(A). The key case is
that $B = A + v_0$ for some $v_0 \in V_F$, and we treat that in the following lemma.


⁸In doing so, we follow the approach in the book by Villa Salvador, Chapter 3, but with different
notation. 


Lemma 9.5. If A is a divisor and $v_0$ is in $V_F$, then

$$\dim_k L(A + v_0)/L(A) \le f_{v_0} = \deg v_0.$$


PROOF. Put $f = f_{v_0}$, let $R_{v_0}$ be the valuation ring of $v_0$, and let $P_{v_0}$ be the
valuation ideal of $v_0$. Since $v_0$ carries $F^\times$ onto $\mathbb{Z}$, we can choose an element
$y \in F^\times$ with $v_0(y) = \operatorname{ord}_{v_0}(A + v_0)$.

Let f + 1 members $x_1, \dots, x_{f+1}$ of $L(A + v_0)$ be given. We shall produce an
equation of linear dependence among the cosets $x_i + L(A)$, and this will prove
the lemma. Computation gives

$$v_0(x_i y) = v_0(x_i) + v_0(y) = v_0(x_i) + \operatorname{ord}_{v_0}(A + v_0) \ge 0$$

for $1 \le i \le f + 1$, since $x_i$ is in $L(A + v_0)$. Hence $x_i y$ is in $R_{v_0}$. Since
$\dim_k(R_{v_0}/P_{v_0}) = f$, there exist members $c_1, \dots, c_{f+1}$ of k not all 0 such that
$\sum_{i=1}^{f+1} c_i(x_i y + P_{v_0}) = P_{v_0}$, i.e., such that $\sum_{i=1}^{f+1} c_i x_i y$ lies in $P_{v_0}$. Then $\sum_{i=1}^{f+1} c_i x_i$
lies in $y^{-1}P_{v_0}$, and

$$v_0\Bigl(\sum_{i=1}^{f+1} c_i x_i\Bigr) \ge -v_0(y) + 1 = -\operatorname{ord}_{v_0}(A + v_0) + 1 = -\operatorname{ord}_{v_0} A. \qquad (*)$$

Since each $x_i$ is in $L(A + v_0)$, so is $\sum_{i=1}^{f+1} c_i x_i$. This fact and $(*)$ together show
that $\sum_{i=1}^{f+1} c_i x_i$ is in L(A), i.e., that $\sum_{i=1}^{f+1} c_i x_i + L(A)$ is the 0 coset. This proves
the desired linear dependence and shows that $\dim_k L(A + v_0)/L(A) \le f$.


Theorem 9.6. If A and B are divisors such that $A \le B$, then L(B)/L(A) is
finite-dimensional over k with

$$\dim_k L(B)/L(A) \le \deg B - \deg A.$$

Moreover, L(A) and L(B) are separately finite-dimensional over k, and consequently

$$\dim_k L(B) - \deg B \le \dim_k L(A) - \deg A.$$


REMARKS. We define $\ell(A) = \dim_k L(A)$. This is finite by the theorem, and
the resulting inequality of the theorem is that

$$\ell(B) - \deg B \le \ell(A) - \deg A.$$


PROOF. The first conclusion is immediate from Lemma 9.5 by induction
on $\sum_v (\operatorname{ord}_v B - \operatorname{ord}_v A)$. Fixing a reference point $v_0$ in $V_F$ and taking
$A = \sum_{\operatorname{ord}_v B < 0} (\operatorname{ord}_v B)v - v_0$ and applying Corollary 9.4 to A, we see that L(A) = 0.
Therefore the first conclusion specializes to

$$\dim_k L(B) - \deg B \le -\deg A.$$

Since $\dim_k L(B)$ is certainly nonnegative, this inequality implies that L(B) is
finite-dimensional. Then we can expand the left side of the first conclusion of the
theorem to obtain

$$\dim_k L(B) - \dim_k L(A) \le \deg B - \deg A,$$

and the proof is complete.


The theorem identifies $\ell(B) - \deg B$ as a quantity of interest when we are trying
to understand a divisor $B$. We shall undertake a study of this quantity, beginning
first with the case of a divisor $B$ equal to a multiple of the pole part $(x)_\infty$ of a
principal divisor $(x)$. Recall that the signs are arranged to have $(x)_\infty \ge 0$.


Lemma 9.7. For each $x$ in $F$ that is not in $k$, there exists a constant $C_x$ such
that the multiple $p(x)_\infty$ of $(x)_\infty$ satisfies

$$\ell(p(x)_\infty) - \deg(p(x)_\infty) \ge C_x$$

for every integer $p$.


PROOF. Applying the observation at the end of Section 1, we form the integral
closure $R$ of $k[x]$ in $F$ and the integral closure $R'$ of $k[x^{-1}]$ in $F$. The discrete
valuations $v$ for which $v(x) < 0$ are exactly those arising from prime ideals in
the prime decomposition of $x^{-1}k[x^{-1}]$, according to Corollary 6.10. Specifically
the ideal $x^{-1}k[x^{-1}]$ in $R'$ decomposes as a product $\prod_j Q_j^{e_j}$, and the
corresponding discrete valuations have $v_{Q_j}(x^{-1}) = e_j$. Theorem 9.3 shows that
$\deg(x)_\infty = \dim_{k(x)} F$.

Let $n = \dim_{k(x)} F$. Choose a basis $y_1, \dots, y_n$ of $F$ over $k(x)$ consisting of
members of $R$. Each $v$ arising from a prime ideal of $R$ has $v(y_j) \ge 0$ for
$1 \le j \le n$ by Proposition 6.7. The remaining $v$'s all have $v(x) < 0$, and
therefore there exists an integer $k \ge 0$ (an integer, not to be confused with the base
field) such that $v(y_j) \ge k\,v(x)$ for $1 \le j \le n$ and for all these remaining $v$'s.
For this value of the integer $k$, the elements $y_1, \dots, y_n$ all lie in $L(k(x)_\infty)$.

Let $m \ge 0$ be arbitrary. The $v$'s coming from some $Q_j$, i.e., those with
$v(x) < 0$, have $v(x^i) \ge v(x^m)$ whenever $0 \le i \le m$, and the remaining $v$'s,
i.e., those with $v(x) \ge 0$, all have $v(x^i) \ge 0$ for $0 \le i \le m$. Therefore
$1, x, x^2, \dots, x^m$ all lie in $L((x^m)_\infty) = L(m(x)_\infty)$.




Multiplying, we see that $x^i y_j$ lies in $L((k + m)(x)_\infty)$ for $0 \le i \le m$ and
$1 \le j \le n$. These elements $x^i y_j$ are linearly independent over $k$, and therefore

$$\ell((k + m)(x)_\infty) \ge (m + 1)n = (m + 1)\deg(x)_\infty.$$

Since deg is a homomorphism from $D_F$ into $\mathbb{Z}$,

$$\deg((k + m)(x)_\infty) = (k + m)\deg(x)_\infty.$$

Therefore each $m \ge 0$ has

$$\ell((k + m)(x)_\infty) - \deg((k + m)(x)_\infty) \ge (m + 1 - k - m)\deg(x)_\infty = (1 - k)\deg(x)_\infty.$$

We have therefore proved that

$$\ell(q(x)_\infty) - \deg(q(x)_\infty) \ge (1 - k)\deg(x)_\infty$$

for all integers $q$ that are sufficiently positive. If $p$ is any integer, we can find $q$
as above with $p \le q$. Then $p(x)_\infty \le q(x)_\infty$, and Theorem 9.6 shows that

$$(1 - k)\deg(x)_\infty \le \ell(q(x)_\infty) - \deg(q(x)_\infty) \le \ell(p(x)_\infty) - \deg(p(x)_\infty).$$

This proves the lemma with $C_x = (1 - k)\deg(x)_\infty$.


Lemma 9.8. If $A$ is any divisor and $x$ is any member of $F^\times$, then $L((x) + A) \cong
L(A)$ canonically. Therefore $\ell((x) + A) = \ell(A)$. In addition, $\deg((x) + A) =
\deg A$.


PROOF. Define a $k$ linear mapping $\varphi : L(A) \to F$ by $\varphi(y) = x^{-1}y$. This is
certainly one-one, and its image is contained in $L((x) + A)$ because any nonzero $z$
in $L(A)$ has $(z) \ge -A$ and then also $(x^{-1}z) = -(x) + (z) \ge -(x) - A$. Similarly
$\psi(y) = xy$ is one-one and carries $L((x) + A)$ into $L(A)$. By inspection, $\psi\varphi = 1$
and $\varphi\psi = 1$. Therefore $L((x) + A)$ and $L(A)$ are canonically isomorphic
and have the same dimension over $k$. For the last conclusion, $\deg((x) + A) =
\deg(x) + \deg A = \deg A$ by Theorem 9.3.


Theorem 9.9 (Riemann’s inequality). For each $x$ in $F$ that is not in $k$, let $g_x$
be the integer such that $1 - g_x$ is the largest possible $C_x$ with

$$\ell(p(x)_\infty) - \deg(p(x)_\infty) \ge C_x$$

for every integer $p$. Then
(a) the integer $g = g_x$ is independent of $x$,
(b) $g$ is $\ge 0$,
(c) $\ell(A) - \deg A \ge 1 - g$ for every divisor $A$.




REMARKS. The integer $g_x$ in the theorem exists by Lemma 9.7. Once it has
been proved to be an integer $g$ independent of $x$, it is called the genus of the
function field $F$ over $k$.


PROOF. We begin by proving (c) with $g$ replaced by $g_x$. Let $C_x$ be any integer
with the property that $\ell(p(x)_\infty) - \deg(p(x)_\infty) \ge C_x$ for all $p$. If a divisor $A$
is given, we can write $A = A_0 - A_\infty$, where $A_0 = \sum_{\operatorname{ord}_v A > 0}(\operatorname{ord}_v A)v$ and
$A_\infty = \sum_{\operatorname{ord}_v A < 0}(-\operatorname{ord}_v A)v$. Then $A \le A_0$, and Theorem 9.6 shows that
$\ell(A) - \deg A \ge \ell(A_0) - \deg A_0$. Thus it is enough to prove (c) for $A_0$. Let $p$ be
any integer $\ge 0$. Since $A_0 \ge 0$, we have $p(x)_\infty - A_0 \le p(x)_\infty$. Hence a second
application of Theorem 9.6 shows that

$$\ell(p(x)_\infty - A_0) - \deg(p(x)_\infty - A_0) \ge \ell(p(x)_\infty) - \deg(p(x)_\infty) \ge C_x.$$

Since deg is a homomorphism, this inequality implies that

$$\ell(p(x)_\infty - A_0) \ge C_x + p\deg(x)_\infty - \deg A_0.$$

Fix an integer $p$ large enough for the right side to be positive. For this $p$, the
vector space $L(p(x)_\infty - A_0)$ is nonzero; let $y$ be a nonzero member of it. This
$y$ has $(y) \ge -(p(x)_\infty - A_0)$, and hence $p(x)_\infty \ge A_0 - (y)$. A third application
of Theorem 9.6, in combination with Lemma 9.8, shows that

$$\ell(p(x)_\infty) - \deg(p(x)_\infty) \le \ell(A_0 - (y)) - \deg(A_0 - (y)) = \ell(A_0) - \deg A_0.$$

The left side is $\ge C_x$, and hence $\ell(A_0) - \deg A_0 \ge C_x$. Therefore

$$\ell(A) - \deg A \ge C_x \tag{$*$}$$

for every divisor $A$. Since one choice of $C_x$ is $1 - g_x$, this proves (c).

Taking $A = p(y)_\infty$, we see that the best $C_y$ has $C_y \ge C_x$. Since the roles of
$x$ and $y$ can be interchanged, this proves (a). Finally if we take $A = 0$ in (c) and
apply Corollary 9.4, we see that $1 - 0 \ge 1 - g$. Thus $g \ge 0$. This proves (b).


EXAMPLES OF GENUS. 


(1) $F = k(x)$ for a transcendental $x$. In the proof of Lemma 9.7, we have
$n = 1$ and can take $y_1 = 1$. Then the integer $k$ equals 0, and the proof of the lemma shows that
the inequality of the lemma holds with $C_x = (1 - 0)\deg(x)_\infty = 1$. Therefore
$1 - g \ge C_x = 1$, and $g \le 0$. So $g = 0$ by Theorem 9.9b.




(2) $F = \mathbb{C}[x, y]/(y^2 - x^4 + 1)$. This example was discussed in Section 1, and
we have $x^{-1}R' = P_1 P_2$ with $P_1 = (x^{-1}, x^{-2}y + 1)$ and $P_2 = (x^{-1}, x^{-2}y - 1)$.
The corresponding valuations therefore have $v_{P_1}(x) = v_{P_2}(x) = -1$. Meanwhile,
the elements 1 and $y$ form a basis of $F$ over $k(x)$. The element 1 has $v_{P_1}(1) =
v_{P_2}(1) = 0$; so 1 is in $L(p(x)_\infty)$ for every $p \ge 0$. Since $2x^{-2}y$ is the sum of a
generator of $P_1$ and a generator of $P_2$, $x^{-2}y$ lies in $R'$. Write $(x^{-2}y) = I_1 \cdots I_h$,
where each $I_j$ is a prime ideal in $R'$. Since $x^{-2}y$ and $P_1$ together generate 1,
$P_1$ is not one of the ideals $I_j$. Similarly $P_2$ is not one of the $I_j$'s. Thus $(y) =
(x^{-1})^{-2}(x^{-2}y) = (P_1 P_2)^{-2} I_1 \cdots I_h$, and we obtain $v_{P_1}(y) = v_{P_2}(y) = -2$.
Hence $y$ lies in $L(2(x)_\infty)$, and we can take $k = 2$ in the proof of Lemma 9.7. For
this $k$, we have $C_x = (1 - 2)\deg(x)_\infty = -2$. Therefore $1 - g \ge C_x = -2$, and
$g \le 3$. In fact, $g = 1$ here, as a special case of the next example. Thus a routine
use of the estimate from Lemma 9.7 has its limitations.

(3) $F = k[x, y]/(y^2 - p(x))$, where $p(x)$ is a square-free polynomial of degree
$m$ and $k$ has characteristic $\ne 2$. Then $g = \tfrac{1}{2}m - 1$ if $m$ is even and $g = \tfrac{1}{2}(m - 1)$
if $m$ is odd. This computation will be carried out in Problems 12–20 at the end
of the chapter.
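
The genus formula quoted in Example (3) is easy to tabulate. The following is a minimal sketch, not part of the text's development; the function name `hyperelliptic_genus` is a hypothetical label chosen here, and the code simply encodes the two cases stated above (square-free $p(x)$ of degree $m$, characteristic not 2).

```python
def hyperelliptic_genus(m: int) -> int:
    """Genus quoted in Example (3) for y^2 = p(x), with p square-free of degree m.

    Assumes the hypotheses of the example (characteristic of k is not 2).
    """
    if m < 1:
        raise ValueError("the degree m must be a positive integer")
    # g = m/2 - 1 when m is even, and g = (m - 1)/2 when m is odd.
    return m // 2 - 1 if m % 2 == 0 else (m - 1) // 2


if __name__ == "__main__":
    for m in range(1, 9):
        print(m, hyperelliptic_genus(m))
```

For $m = 4$ the function returns 1, in agreement with the value $g = 1$ stated for Example (2), where $p(x) = x^4 - 1$.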


Theorem 9.9 gives the lower bound of $1 - g$ for $\ell(A) - \deg A$ for all divisors
$A$. There is also an upper bound, with the proviso that $L(A) \ne 0$.


Proposition 9.10. If $A$ is any divisor such that $L(A) \ne 0$, then

$$\ell(A) - \deg A \le 1.$$

Hence any divisor $A$ with $\deg A \le -1$ has $\ell(A) = 0$.


PROOF. Let $y$ be a member of $F^\times$ that lies in $L(A)$. Then every $v \in V_F$ has
$v(y) \ge -\operatorname{ord}_v A$ and hence $0 \ge -\operatorname{ord}_v A - v(y) = -\operatorname{ord}_v(A + (y))$. This
inequality says that $A + (y) \ge 0$. Then Corollary 9.4 and Theorem 9.6 together
give

$$1 = \ell(0) - \deg 0 \ge \ell(A + (y)) - \deg(A + (y)),$$

and the right side equals $\ell(A) - \deg A$ by Lemma 9.8. Then $1 - \deg A \le
\ell(A) - \deg A \le 1$, and we must have $\deg A \ge 0$ whenever $\ell(A) \ge 1$.


4. Riemann—Roch Theorem 


Riemann’s inequality, proved in Section 3, shows that every divisor $A$ satisfies
$\ell(A) - \deg A \ge 1 - g$, where $g$ is the genus of the curve in question. The Riemann–Roch
Theorem, to be proved in the present section, gives an interpretation for the
difference between the two sides of the inequality.




In the classical setting of compact Riemann surfaces, the proof of the Riemann–Roch
Theorem makes use of meromorphic differential forms, sometimes called
abelian differentials by complex analysts. Meromorphic differential forms are
objects that locally look like $f(z)\,dz$, where $z$ is a local coordinate and $f(z)$
is a meromorphic function, and that fit together to be globally defined on the
complex manifold. What the formula $f(z)\,dz = g(w)\,dw$ for fitting together
means is that in the overlap of the regions for two local coordinates $z$ and $w$,
$f(z)\,dz = g(w(z))\frac{dw}{dz}\,dz$ holds and hence $f(z) = g(w(z))\frac{dw}{dz}$. In the language
of differential geometry, a meromorphic differential form is a meromorphic section
of the cotangent bundle of the complex manifold. An important step that
has to be carried out to make these differential forms useful is to prove a version
of the Residue Theorem. This theorem says that the sum over all points of the
manifold of the residues of the differential form is 0, the residue of $f(z)\,dz$ at
the point corresponding to $z = 0$ being the coefficient of $z^{-1}$ in the Laurent
expansion⁹ of $f(z)$ about 0. Once this theorem is in hand, one can begin to prove
the Riemann–Roch Theorem.

In our present setting with the function field F in one variable over k, it is not 
too hard to define an analog of meromorphic differential forms and to establish 
that they behave the way one would expect from differential calculus. In order 
to make use of these forms, one has to prove an analog of the Residue Theorem, 
and doing so requires some hard work. A. Weil discovered that this construction 
could be bypassed and that one could prove the theorem directly. The idea is to 
introduce the tool that differential forms make available and to skip the differential 
forms themselves. 

It is worth understanding this background in a little more detail because otherwise
the proof below may seem very strange indeed. To fix the ideas for this
background only, suppose that the base field $k$ is algebraically closed. Let us
recall that elements of $V_F$ are meant to correspond to points of a zero locus in
projective space, at least when the curve is everywhere nonsingular. We write
this correspondence as $v \mapsto p(v)$. A local coordinate about $p(v)$ is denoted
by a symbol like $z$ classically, and in the setup with valuations, it is simply a
member of the valuation ideal of $v$ with $v(z) = 1$. A differential form that is
given locally by classical expressions like $f(z)\,dz$ attaches to each $v$ in $V_F$ the
function $g_v \mapsto \operatorname{Residue}_{p(v)}(g_v f\,dz)$, where $g_v$ is any Laurent expansion about
$p(v)$.

⁹ One has to show that this coefficient is independent of the choice of the local coordinate.

Classically this Laurent expansion is to be convergent in some deleted neighborhood
of $p(v)$, and it involves only finitely many negative powers of the
local coordinate. The assumption that it converges is not important because
if $v(f) = n$, then the only powers of $z$ whose coefficients in $g_v$ affect the residue
at $p(v)$ are the $k$th powers for $k + n \le -1$. Thus the assumption on $g_v$ is that it is
a member of the Laurent series field $k((z))$. To compute the residue for $g_v f\,dz$,
we need to know how to interpret $f(z)$ as a Laurent series about $p(v)$. Let $R_v$
be the valuation ring of $v$, and let $P_v$ be the valuation ideal. The field $R_v/P_v$ is
a finite extension of $k$ and must be isomorphic to $k$ because $k$ is algebraically
closed. For each $c \in k$, choose a member $a_c \in R_v$ such that the coset $a_c + P_v$
corresponds to $c$; we may assume that the member chosen for $c = 0$ is 0. Denote the
set of these elements $a_c$ by $\mathcal{R}_v$. If $v(f) = n$, then $h = z^{-n}f$ is in $R_v$, and thus some
unique $a_0$ in $\mathcal{R}_v$ has the property that $h - a_0$ is in $P_v$. Hence $z^{-1}(h - a_0)$ is in $R_v$,
and some unique $a_1$ in $\mathcal{R}_v$ has the property that $z^{-1}(h - a_0) - a_1$ is in $P_v$. From this,
$z^{-1}(z^{-1}(h - a_0) - a_1)$ is in $R_v$, and we can continue to subtract members of
$\mathcal{R}_v$ and divide by $z$ in this way. The result is that $h = a_0 + a_1 z + a_2 z^2 + \cdots$
in the sense that $v(h - a_0 - a_1 z - \cdots - a_j z^j) \ge j + 1$ for every $j$. Therefore
$f = z^n h = z^n(a_0 + a_1 z + a_2 z^2 + \cdots)$. If we replace each $a_j$ by the corresponding
member $c_j$ of $k$, then $z^n(c_0 + c_1 z + c_2 z^2 + \cdots)$ is the member of $k((z))$ that we
associate to $f$.
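
The subtract-and-divide procedure just described can be carried out mechanically in the simplest case. The sketch below is an illustration under stated assumptions rather than the text's construction: it takes $F = \mathbb{Q}(z)$, the point $z = 0$, and a rational function given as a quotient of polynomials, so that the residue class of $h$ is just its value at 0. The function name `laurent_expansion` and the coefficient-list representation (lowest degree first) are choices made here for illustration.

```python
from fractions import Fraction


def laurent_expansion(num, den, terms):
    """Return (n, [c_0, ..., c_{terms-1}]) with f = z^n (c_0 + c_1 z + ...).

    f = num(z)/den(z); polynomials are coefficient lists, lowest degree first.
    Mirrors the procedure in the text: take the residue class of h modulo the
    valuation ideal, subtract it, divide by z, and repeat.
    """
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    if not any(num) or not any(den):
        raise ValueError("numerator and denominator must be nonzero")

    def split_power_of_z(p):
        # Write p = z^k * q with q(0) != 0 and return (k, q).
        k = 0
        while p[k] == 0:
            k += 1
        return k, p[k:]

    a, num = split_power_of_z(num)
    b, den = split_power_of_z(den)
    n = a - b  # n = v(f); now h = num/den is a unit in the valuation ring

    coeffs = []
    for _ in range(terms):
        c = num[0] / den[0]  # the constant in k representing h modulo P_v
        coeffs.append(c)
        width = max(len(num), len(den))
        padded_num = num + [Fraction(0)] * (width - len(num))
        padded_den = den + [Fraction(0)] * (width - len(den))
        # h - c has numerator num - c*den, whose constant term is 0 ...
        diff = [p - c * q for p, q in zip(padded_num, padded_den)]
        # ... so dividing by z just drops that constant term.
        num = diff[1:] or [Fraction(0)]
    return n, coeffs
```

For instance, `laurent_expansion([1], [0, 0, 1, 1], 4)` expands $1/(z^2(1+z))$; the coefficient of $z^{-1}$, which is the residue of $f\,dz$ at 0, is the entry of the returned list at index $-1 - n$.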

With this identification in place, we can regard the given differential form as
yielding a $k$ linear function

$$\operatorname{Residue} : \prod_{v \in V_F} k((z)) \to \prod_{v \in V_F} k.$$


We want to cut down the domain of this mapping so the sum of the residues is
meaningful for every member of the image. The local expressions $f(z)\,dz$ involve
only finitely many poles in a neighborhood of each point, and compactness implies
that there are only finitely many such points globally. Except at these points the
residue of $g_v f\,dz$ can be nonzero only if $g_v$ has a pole at $p(v)$. Thus we can
ensure that the sum of the residues is meaningful if we assume that $v(g_v) \ge 0$
except for finitely many $v$.

For algebraic purposes the domain is still unnecessarily large. Since each
local coordinate in the algebraic realization is actually a member of $F$, the only
members of $k((z))$ that we need to handle at each point are the members of $F$.
So let $\mathbb{A}_F^{0} = \prod_{v \in V_F} F$, and let $\mathbb{A}_F$ be the $k$ subspace of all members $\{g_v\}$ of the
product such that $v(g_v) < 0$ only finitely often. Then the differential form gives
us a $k$ linear functional

$$\text{Sum of Residues} : \mathbb{A}_F \to k.$$


We have seen that if the differential form is given by $f(z)\,dz$ locally near $p(v)$ and
if $v(g_v) \ge -v(f)$, then the residue is 0 at $p(v)$. Hence there is some divisor $A$,
depending on the differential form, such that if $v(g_v) \ge -\operatorname{ord}_v A$ for all $v \in V_F$,
then all residues are 0 and the sum of the residues is 0. Consequently the kernel
of the sum-of-residues map associated to the differential form contains all tuples
$\{g_v\}$ of $\mathbb{A}_F$ such that $v(g_v) \ge -\operatorname{ord}_v A$ for this divisor $A$ and all $v$.




Finally there is one more classical fact to bring into play. This is the Residue
Theorem itself, saying that the sum of the residues is zero for any meromorphic
differential form. If $\{g_v\}$ is actually a constant tuple with $g_v = h$ for some $h \in F$,
then the sum-of-residues map as defined above is giving us the classical sum of
residues for the product of $h$ and the given differential form. This sum is zero. In
other words, every member of the diagonally embedded $F$ in $\mathbb{A}_F$ lies in the kernel
of the sum-of-residues map associated to the differential form.

Weil’s idea in a nutshell is that instead of developing differential forms, working
with residues, and proving the consequence of the Residue Theorem, one should
just start with any abstract linear functional on $\mathbb{A}_F$ that satisfies the conditions
that we noted above. Then the Riemann–Roch Theorem drops out fairly easily.
This is the approach we shall follow. The abstract kind of linear functional on
$\mathbb{A}_F$ will be called a “differential” in what follows, as a reminder of the classical
object that lies behind it.¹⁰


Without further ado, we proceed with the Riemann–Roch Theorem. In this
section, $F$ denotes a function field in one variable over a field $k$, and we assume
that every element of $F$ outside $k$ is transcendental over $k$. We continue with the
notation $V_F$, $D_F$, $f_v$, $\operatorname{ord}_v A$, $\deg A$, and $(x)$ for $x \in F^\times$, all as in Sections 2–3,
and with the notation $L(A)$ and $\ell(A)$ as in Section 3. If $A$ is a divisor, we let

$$\delta(A) = \ell(A) - \deg A - (1 - g).$$

Riemann’s inequality (Theorem 9.9) implies that $\delta(A) \ge 0$ for all $A$’s and that
$\delta(A) = 0$ for some $A$’s. We seek an interpretation of $\delta(A)$.

Let $\mathbb{A}_F^{0}$ be the ring of all functions from $V_F$ into $F$, with the operations taken
pointwise. It is customary to write such a function $\xi$ as $v \mapsto \xi_v$ rather than as
$v \mapsto \xi(v)$. Let $\mathbb{A}_F$ be the subring¹¹ of all members $\xi$ of $\mathbb{A}_F^{0}$ such that $v(\xi_v) < 0$
for only finitely many $v$ in $V_F$. We shall treat $\mathbb{A}_F$ as an infinite-dimensional
associative $k$ algebra with identity.

Consider the diagonal map $\Delta : F \to \mathbb{A}_F$ defined by the formula $\Delta(x)_v = x$ for
all $x \in F$ and all $v \in V_F$. Under this map, the member $x$ of $F$ goes to the function whose value
at each $v$ is $x$. The reason that $\Delta(x)$ is in $\mathbb{A}_F$ and not just $\mathbb{A}_F^{0}$ is that $v(x) < 0$ for
only finitely many $v \in V_F$. The map $\Delta$ is a one-one $k$ algebra homomorphism.


¹⁰ Weil’s argument dates to 1935. It appears in book form in Weil’s Basic Number Theory, where
the details are carried out when k is a finite field and where comments are made for general k. Lang
simplified Weil’s argument and wrote it down for algebraically closed fields k in his Introduction
to Algebraic and Abelian Functions. A version of this argument for general k appears in Villa
Salvador’s book. The present exposition benefits from all three of these books.

¹¹ For readers familiar with Section VI.10, the notation is intended to hint at “adeles” of F.
However, completions and topologies will play no role in the construction.


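For each divisor $A$, define

$$\mathcal{L}(A) = \{\xi \in \mathbb{A}_F \mid v(\xi_v) \ge -\operatorname{ord}_v(A) \text{ for all } v \in V_F\}.$$

It is immediate from the definitions that

$$\mathcal{L}(A) \cap \Delta(F) = \Delta(L(A)).$$

Let us see that

$$A \le B \quad\text{if and only if}\quad \mathcal{L}(A) \subseteq \mathcal{L}(B).$$

In fact, the “only if” part of the statement is evident. Conversely suppose that
$\mathcal{L}(A) \subseteq \mathcal{L}(B)$. Choose for each $v \in V_F$ an element $\pi_v$ in $F$ with $v(\pi_v) = 1$. The
function $\xi^A : V_F \to F$ defined by $(\xi^A)_v = \pi_v^{-\operatorname{ord}_v A}$ has $v((\xi^A)_v) = -\operatorname{ord}_v A$
and lies in $\mathbb{A}_F$, since $\operatorname{ord}_v A$ is nonzero for only finitely many $v$. The definitions
show that $\xi^A$ lies in $\mathcal{L}(A)$, hence in $\mathcal{L}(B)$. Thus $-\operatorname{ord}_v A = v((\xi^A)_v) \ge
-\operatorname{ord}_v B$, $\operatorname{ord}_v A \le \operatorname{ord}_v B$, and $A \le B$. This proves the “if” part of the
displayed equivalence. If we apply the equivalence twice, we see that

$$A = B \quad\text{if and only if}\quad \mathcal{L}(A) = \mathcal{L}(B).$$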


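Let us take note of two operations on divisors and the effect of these operations
on the spaces $\mathcal{L}(A)$. If $A$ and $B$ are divisors, we define $C = \min(A, B)$
pointwise by the formula $\operatorname{ord}_v C = \min(\operatorname{ord}_v A, \operatorname{ord}_v B)$. Then $C$ is a divisor
with $C \le A$ and $C \le B$. Thus $\mathcal{L}(C) \subseteq \mathcal{L}(A)$ and $\mathcal{L}(C) \subseteq \mathcal{L}(B)$, and we
consequently obtain

$$\mathcal{L}(\min(A, B)) \subseteq \mathcal{L}(A) \cap \mathcal{L}(B).$$

Still with $A$ and $B$ as divisors, we define $C = \max(A, B)$ pointwise by the
formula $\operatorname{ord}_v C = \max(\operatorname{ord}_v A, \operatorname{ord}_v B)$. Then $A \le C$ and $B \le C$, from which
we obtain $\mathcal{L}(A) \subseteq \mathcal{L}(C)$ and $\mathcal{L}(B) \subseteq \mathcal{L}(C)$. This proves the inclusion $\subseteq$ in the
identity

$$\mathcal{L}(A) + \mathcal{L}(B) = \mathcal{L}(\max(A, B)).$$

To prove $\supseteq$, let $\xi$ be in $\mathcal{L}(\max(A, B))$. We shall decompose $\xi$ as a sum $\eta + \zeta$ in
$\mathcal{L}(A) + \mathcal{L}(B)$ with one of $\eta_v$ and $\zeta_v$ equal to 0 for each $v$. Let $v$ be given. Since
$\xi$ is in $\mathcal{L}(\max(A, B))$, $v(\xi_v) \ge -\operatorname{ord}_v(\max(A, B)) = -\max(\operatorname{ord}_v A, \operatorname{ord}_v B)$.
That is, $-v(\xi_v) \le \max(\operatorname{ord}_v A, \operatorname{ord}_v B)$. If $-v(\xi_v) \le \operatorname{ord}_v A$, then define $\eta_v = \xi_v$
and $\zeta_v = 0$; otherwise, we have $-v(\xi_v) \le \operatorname{ord}_v B$, and we define $\eta_v = 0$ and
$\zeta_v = \xi_v$. Then $v(\eta_v) \ge -\operatorname{ord}_v A$ for all $v$, and $v(\zeta_v) \ge -\operatorname{ord}_v B$ for all $v$. This
proves $\supseteq$ in the displayed formula.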
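
The pointwise operations on divisors used above are simple to model. The following is a minimal sketch under assumptions made only for illustration: a divisor is stored as a dictionary from (hypothetical) place names such as `"v0"` to the nonzero integers $\operatorname{ord}_v$, and the residue degrees $f_v$ are supplied as a separate dictionary `f` with made-up values.

```python
def ord_at(D, v):
    """Order of the divisor D at the place v (0 if v does not occur in D)."""
    return D.get(v, 0)


def _combine(A, B, pick):
    # Apply `pick` (min or max) to the orders place by place, dropping zeros.
    result = {}
    for v in set(A) | set(B):
        n = pick(ord_at(A, v), ord_at(B, v))
        if n != 0:
            result[v] = n
    return result


def div_min(A, B):
    return _combine(A, B, min)


def div_max(A, B):
    return _combine(A, B, max)


def leq(A, B):
    """A <= B means ord_v A <= ord_v B at every place v."""
    return all(ord_at(A, v) <= ord_at(B, v) for v in set(A) | set(B))


def degree(D, f):
    """deg D = sum of f_v * ord_v D over the places v in D."""
    return sum(f[v] * n for v, n in D.items())


# Hypothetical data: three places with residue degrees 1, 2, 1.
f = {"v0": 1, "v1": 2, "v2": 1}
A = {"v0": 2, "v1": -1}
B = {"v0": -1, "v2": 3}
C_min = div_min(A, B)
C_max = div_max(A, B)
```

With these conventions, `C_min` lies below both `A` and `B`, and `C_max` lies above both, which is the ordering used to compare the corresponding sets $\mathcal{L}(\cdot)$. Because $\min(a, b) + \max(a, b) = a + b$ at each place, the degrees of `C_min` and `C_max` add up to $\deg A + \deg B$.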




Lemma 9.11. If $A$ and $B$ are divisors with $A \le B$, then
$$\dim_k(\mathcal{L}(B)/\mathcal{L}(A)) = \deg B - \deg A.$$


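PROOF. Proceeding inductively, we see that it is enough to handle the case that
$B = A + v_0$, where $v_0$ is in $V_F$. Thus we are to show that

$$\dim_k(\mathcal{L}(A + v_0)/\mathcal{L}(A)) = f_{v_0} = \deg(v_0). \tag{$*$}$$

Put $f = f_{v_0}$, let $R_{v_0}$ be the valuation ring of $v_0$, and let $P_{v_0}$ be the valuation ideal
of $v_0$. To prove $\le$ in $(*)$, we argue as in the proof of Lemma 9.5. Since $v_0$ carries
$F^\times$ onto $\mathbb{Z}$, we can choose an element $y \in F^\times$ with $v_0(y) = \operatorname{ord}_{v_0}(A + v_0)$.

Let $f + 1$ members $\xi^{(1)}, \dots, \xi^{(f+1)}$ of $\mathcal{L}(A + v_0)$ be given. We shall produce
an equation of linear dependence among the cosets $\xi^{(i)} + \mathcal{L}(A)$, and this will
prove $\le$ in $(*)$. Computation gives

$$v_0(\xi^{(i)}_{v_0} y) = v_0(\xi^{(i)}_{v_0}) + v_0(y) = v_0(\xi^{(i)}_{v_0}) + \operatorname{ord}_{v_0}(A + v_0) \ge 0$$

for $1 \le i \le f + 1$, with the inequality at the right holding because $\xi^{(i)}$ is
in $\mathcal{L}(A + v_0)$. Hence $\xi^{(i)}_{v_0} y$ is in $R_{v_0}$. Since $\dim_k(R_{v_0}/P_{v_0}) = f$, there exist
members $c_1, \dots, c_{f+1}$ of $k$ not all 0 such that $\sum_{i=1}^{f+1} c_i(\xi^{(i)}_{v_0} y + P_{v_0}) = P_{v_0}$, i.e.,
such that $\sum_{i=1}^{f+1} c_i \xi^{(i)}_{v_0} y$ lies in $P_{v_0}$. Then $\sum_{i=1}^{f+1} c_i \xi^{(i)}_{v_0}$ lies in $y^{-1}P_{v_0}$, and

$$v_0\Big(\sum_{i=1}^{f+1} c_i \xi^{(i)}_{v_0}\Big) \ge -v_0(y) + 1 = -\operatorname{ord}_{v_0}(A + v_0) + 1 = -\operatorname{ord}_{v_0} A. \tag{$**$}$$

Since each $\xi^{(i)}$ is in $\mathcal{L}(A + v_0)$, so is $\sum_{i=1}^{f+1} c_i \xi^{(i)}$. This fact and $(**)$ together
show that $\sum_{i=1}^{f+1} c_i \xi^{(i)}$ is in $\mathcal{L}(A)$, i.e., that $\sum_{i=1}^{f+1} c_i \xi^{(i)} + \mathcal{L}(A)$ is the 0 coset. This
proves the desired linear dependence and shows that $\dim_k \mathcal{L}(A + v_0)/\mathcal{L}(A) \le f$.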

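To prove $\ge$ in $(*)$, we shall produce $f$ members $\xi^{(j)}$ of $\mathcal{L}(A + v_0)$ that are
linearly independent modulo $\mathcal{L}(A)$. We begin by choosing $\eta$ in $\mathcal{L}(A)$ with
$v_0(\eta_{v_0}) = -\operatorname{ord}_{v_0} A$. (For example take any member $\eta'$ of $\mathcal{L}(A)$, change $\eta'_{v_0}$
to a new value on which $v_0$ takes the value $-\operatorname{ord}_{v_0} A$, and leave $\eta'$ unchanged at
all other $v$.) Let $x_1, \dots, x_f$ be a set of representatives in $R_{v_0}$ of the $f$ members of
a $k$ basis of the quotient $R_{v_0}/P_{v_0}$, and let $\pi_{v_0}$ be a member of $F$ with $v_0(\pi_{v_0}) = 1$.
Define $\xi^{(j)}$ for $1 \le j \le f$ by

$$\xi^{(j)}_v = \begin{cases} \eta_v & \text{for } v \ne v_0, \\ \eta_{v_0} x_j \pi_{v_0}^{-1} & \text{for } v = v_0. \end{cases}$$

For each $j$, we have

$$v_0(\eta_{v_0} x_j \pi_{v_0}^{-1}) = v_0(\eta_{v_0}) + v_0(x_j) - v_0(\pi_{v_0}) = -\operatorname{ord}_{v_0} A + v_0(x_j) - 1 \ge -\operatorname{ord}_{v_0} A - 1,$$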




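and thus $\xi^{(j)}$ is in $\mathcal{L}(A + v_0)$. To prove the linear independence modulo $\mathcal{L}(A)$,
suppose that $c_1, \dots, c_f$ are members of $k$ such that $\sum_{j=1}^{f} c_j \xi^{(j)}$ is in $\mathcal{L}(A)$. In
this case we have an inequality $v_0\big(\sum_{j=1}^{f} c_j \xi^{(j)}_{v_0}\big) \ge -\operatorname{ord}_{v_0} A$, which expands out
as

$$v_0\Big(\sum_{j=1}^{f} c_j \eta_{v_0} x_j \pi_{v_0}^{-1}\Big) \ge v_0(\eta_{v_0}).$$

Since $v_0(\pi_{v_0}^{-1}) = -1$, subtraction of $v_0(\eta_{v_0})$ from both sides yields $v_0\big(\sum_{j=1}^{f} c_j x_j\big)
\ge 1$. Therefore $\sum_{j=1}^{f} c_j x_j$ lies in $P_{v_0}$. By the assumed linear independence over
$k$ of the $x_j$'s modulo $P_{v_0}$, all the $c_j$'s are 0. Therefore the elements $\xi^{(j)}$ are linearly
independent modulo $\mathcal{L}(A)$, and the proof of $\ge$ in $(*)$ is complete.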


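Lemma 9.12. If $A$ and $B$ are divisors with $A \le B$, then there is an exact
sequence in the category of $k$ vector spaces given by

$$0 \to L(B)/L(A) \xrightarrow{\ \psi\ } \mathcal{L}(B)/\mathcal{L}(A) \xrightarrow{\ \varphi\ } (\mathcal{L}(B) + \Delta(F))/(\mathcal{L}(A) + \Delta(F)) \to 0.$$

Consequently

$$\dim_k (\mathcal{L}(B) + \Delta(F))/(\mathcal{L}(A) + \Delta(F)) = (\ell(A) - \deg A) - (\ell(B) - \deg B) = \delta(A) - \delta(B).$$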


PROOF. The map $\psi$ is induced by the map $\Delta : L(B) \to \mathcal{L}(B)$ followed by
passage to the quotient. It descends to $L(B)/L(A)$ because $\Delta(L(A)) \subseteq \mathcal{L}(A)$,
and it is one-one because $\Delta(L(B)) \cap \mathcal{L}(A) \subseteq \Delta(L(A))$. The map $\varphi$ is induced
by the map $x \mapsto x + \Delta(F)$ followed by passage to the quotient. It descends
to $\mathcal{L}(B)/\mathcal{L}(A)$ because $\mathcal{L}(A)$ maps into $\mathcal{L}(A) + \Delta(F)$, and it is onto because
$x \mapsto x + \Delta(F)$ carries $\mathcal{L}(B)$ onto $\mathcal{L}(B) + \Delta(F)$. The composition $\varphi\psi$ is 0
because $L(B)$ maps under $\Delta$ into $\Delta(F)$, which lies in the 0 coset.

To prove the exactness, let $\xi + \mathcal{L}(A)$ be in $\ker\varphi$. This condition means that $\xi$
is in $\mathcal{L}(B)$ and has $\xi + \Delta(F)$ in $\mathcal{L}(A) + \Delta(F)$. Thus there exists $\eta$ in $\mathcal{L}(A)$ with
$\xi - \eta$ in $\Delta(F)$. Since $\xi$ and $\eta$ are in $\mathcal{L}(B)$, $\xi - \eta$ is in $\mathcal{L}(B) \cap \Delta(F) \subseteq \Delta(L(B))$.
Hence $\xi + \mathcal{L}(A) = (\xi - \eta) + \mathcal{L}(A)$ lies in $\Delta(L(B)) + \mathcal{L}(A) = \operatorname{image}\psi$, and
exactness is proved.

From the exactness we obtain

$$\dim_k \mathcal{L}(B)/\mathcal{L}(A) = \dim_k L(B)/L(A) + \dim_k (\mathcal{L}(B) + \Delta(F))/(\mathcal{L}(A) + \Delta(F)).$$

The left side equals $\deg B - \deg A$ by Lemma 9.11, and the first term on the right
side equals $\ell(B) - \ell(A)$ by the finite dimensionality of $L(B)$ and $L(A)$, which
was proved as part of Theorem 9.6. The result follows.




Theorem 9.13. There exists a divisor $C$ such that $\mathbb{A}_F = \mathcal{L}(C) + \Delta(F)$. For
each divisor $A$,

$$\delta(A) = \dim_k\big(\mathbb{A}_F/(\mathcal{L}(A) + \Delta(F))\big).$$


PROOF. Riemann’s inequality produces a divisor $C$, specifically any sufficiently
large positive multiple of a divisor $(x)_\infty$, such that $\delta(C) = 0$. If we can
show that $\mathbb{A}_F = \mathcal{L}(C) + \Delta(F)$, then the dimensional equality in Lemma 9.12
with $B = C$ will complete the proof of the present theorem.

Suppose that there exists a member $\xi$ of $\mathbb{A}_F$ that is not in $\mathcal{L}(C) + \Delta(F)$. For
each $v \in V_F$, let $a_v = \min(v(\xi_v), -\operatorname{ord}_v C)$, and define $C' = -\sum_{v \in V_F} a_v v$.
Since $\xi$ is in $\mathbb{A}_F$, only finitely many integers $v(\xi_v)$ are negative. This fact and
the fact that $C$ is a divisor together imply that only finitely many $a_v$ are negative.
Since $C$ is a divisor, only finitely many integers $-\operatorname{ord}_v C$ can be positive, and
thus only finitely many $a_v$ can be positive. Therefore $C'$ is a divisor.

The definition of $C'$ is arranged in such a way that $C \le C'$. Also, every $v$ has
$v(\xi_v) \ge a_v = -\operatorname{ord}_v C'$, and hence $\xi$ lies in $\mathcal{L}(C')$. Consequently

$$\dim_k (\mathcal{L}(C') + \Delta(F))/(\mathcal{L}(C) + \Delta(F)) \ge 1.$$

By Lemma 9.12, $\delta(C) - \delta(C') \ge 1$. Since $C$ was assumed to have $\delta(C) = 0$, we
obtain $-\delta(C') \ge 1$, in contradiction to the fact that $\delta(A) \ge 0$ for every divisor
$A$. We conclude that every $\xi$ in $\mathbb{A}_F$ lies in $\mathcal{L}(C) + \Delta(F)$.


Theorem 9.13 gives a first interpretation of the difference $\delta(A)$ between the
two sides of Riemann’s inequality (Theorem 9.9). We shall now apply Theorem
9.13 and reinterpret $\delta(A)$ as the dimension $\ell(B)$ of a suitable divisor $B$ obtained
from $A$, and then we will have obtained the Riemann–Roch Theorem.

A differential of $F$ is a $k$ linear functional $\omega$ on $\mathbb{A}_F$ with the property that $\omega$
vanishes on $\mathcal{L}(A)$ for some divisor $A$ and $\omega$ vanishes also on $\Delta(F)$. The set of
all differentials of $F$ will be denoted by $\operatorname{Diff}(F)$. Let us observe that $\operatorname{Diff}(F)$ is
a vector subspace of the space of $k$ linear functionals on $\mathbb{A}_F$. Scalar multiplication by $k$ is
not an issue. To see that $\operatorname{Diff}(F)$ is closed under pointwise addition, let $\omega$ and
$\omega'$ be differentials vanishing on $\mathcal{L}(A)$ and $\mathcal{L}(B)$, respectively. We have seen that
$\mathcal{L}(\min(A, B)) \subseteq \mathcal{L}(A) \cap \mathcal{L}(B)$. Thus $\omega + \omega'$ vanishes on $\mathcal{L}(\min(A, B))$. Since
$\omega + \omega'$ vanishes also on $\Delta(F)$, $\omega + \omega'$ is a differential.

The $k$ vector space of differentials vanishing on $\mathcal{L}(A) + \Delta(F)$ may be identified
with the vector space of $k$ linear functionals on the quotient $\mathbb{A}_F/(\mathcal{L}(A) + \Delta(F))$,
and the latter space is finite-dimensional of dimension $\delta(A)$ by Theorem 9.13.
Since a finite-dimensional vector space and its dual have the same dimension, the
$k$ vector space of differentials vanishing on $\mathcal{L}(A) + \Delta(F)$ has $k$ dimension $\delta(A)$.

In addition, $\operatorname{Diff}(F)$ carries a scalar multiplication by $F$ that makes it into an
$F$ vector space. What is required to verify this statement is a definition, and then
the verification of the properties of an $F$ vector space is routine. If $y$ is in $F$ and
$\omega$ is a differential, we define $y\omega$ on $\mathbb{A}_F$ by $(y\omega)(\xi) = \omega(\Delta(y)\xi)$. The linear
functional $y\omega$ vanishes on $\Delta(F)$ because $\Delta$ is a homomorphism. It is enough to
check for $y \ne 0$ that

    if $\omega$ vanishes on $\mathcal{L}(A)$, then $y\omega$ vanishes on $\mathcal{L}(A + (y))$,

where $(y)$ is the principal divisor corresponding to $y$. To prove this vanishing,
let $\xi$ be in $\mathcal{L}(A + (y))$. Then $v(\xi_v) \ge -\operatorname{ord}_v(A + (y)) = -\operatorname{ord}_v A - \operatorname{ord}_v(y) =
-\operatorname{ord}_v A - v(y)$, which implies that $v(\xi_v y) \ge -\operatorname{ord}_v A$, which implies that $\Delta(y)\xi$
lies in $\mathcal{L}(A)$, which implies that $\omega(\xi\Delta(y)) = 0$, which implies that $(y\omega)(\xi) = 0$.
This proves the asserted vanishing, and it follows that $\operatorname{Diff}(F)$ carries a well-defined
scalar multiplication by $F$.

Each set $\mathcal{L}(A)$, where $A$ is a divisor, will be called a parallelotope of $\mathbb{A}_F$.
These sets are large subsets of $\mathbb{A}_F$, since $\dim_k \mathbb{A}_F/(\mathcal{L}(A) + \Delta(F))$ is finite and
$\dim_k \mathbb{A}_F/\Delta(F)$ is infinite. We are going to associate a particular parallelotope
to each nonzero differential. Since we have seen that distinct parallelotopes
correspond to distinct divisors, we shall obtain a way of associating a divisor to
each nonzero differential.


Corollary 9.14. If $\omega$ is a nonzero differential and $\mathcal{L}(A)$ is a parallelotope in
its kernel, then

$$\ell(A) \le \delta(0) \qquad\text{and}\qquad \deg A \le \delta(0) + g - 1.$$

Consequently there exists a unique maximum parallelotope on which $\omega$ vanishes.


REMARKS. In view of the remarks before the corollary, we therefore obtain a
function $\omega \mapsto \operatorname{Div}(\omega)$ from the set $\operatorname{Diff}(F) - \{0\}$ of nonzero differentials into the
set $D_F$ of divisors.


PROOF. If we know that $\ell(A) \le \delta(0)$, then addition to this inequality of
Riemann’s inequality $\deg A - \ell(A) \le g - 1$ as given in Theorem 9.9 shows that

$$\deg A \le \delta(0) + g - 1$$

and proves the second inequality. The inequality $\ell(A) \le \delta(0)$ is trivial if
$L(A) = 0$.

Therefore we may assume in the two inequalities that $L(A) \ne 0$. Let $y$ be
any nonzero member of $L(A)$. Since the kernel of $\omega$ contains $\mathcal{L}(A)$, the kernel
of $y\omega$ contains $\mathcal{L}(A + (y))$, by a computation made above. Meanwhile, the
element $y$, being in $L(A)$, has $(y) \ge -A$ and hence $0 \le A + (y)$. Therefore
$\mathcal{L}(0) \subseteq \mathcal{L}(A + (y))$, and the kernel of $y\omega$ contains $\mathcal{L}(0)$. Since the kernel of $y\omega$
contains $\Delta(F)$, $y\omega$ is well defined on the quotient space $\mathbb{A}_F/(\mathcal{L}(0) + \Delta(F))$.




Now suppose that $y_1, \dots, y_n$ is a $k$ basis of $L(A)$. Let us use the fact that
$\omega \ne 0$ to prove that $y_1\omega, \dots, y_n\omega$ are linearly independent when viewed on
$\mathbb{A}_F/(\mathcal{L}(0) + \Delta(F))$: If $c_1, \dots, c_n$ are members of $k$ not all 0, then $z = \sum_{j=1}^{n} c_j y_j$
is a nonzero member of $L(A)$, and we have just seen that $z\omega$ is well defined on
$\mathbb{A}_F/(\mathcal{L}(0) + \Delta(F))$. Then we have $\sum_{j=1}^{n} c_j (y_j\omega) = \big(\sum_{j=1}^{n} c_j y_j\big)\omega = z\omega$, and
this cannot act as 0 on $\mathbb{A}_F/(\mathcal{L}(0) + \Delta(F))$ without being identically 0 on $\mathbb{A}_F$.
Since any $\xi$ such that $\omega(\xi) \ne 0$ has the property that $z\omega(\Delta(z)^{-1}\xi) \ne 0$, the
linear functionals $y_1\omega, \dots, y_n\omega$ on $\mathbb{A}_F/(\mathcal{L}(0) + \Delta(F))$ are linearly independent.

We know that $\delta(0) = \dim_k \mathbb{A}_F/(\mathcal{L}(0) + \Delta(F))$ by Theorem 9.13, and hence

$$n = \ell(A) \le \delta(0).$$

This completes the proof of the two inequalities.

We turn to the existence and uniqueness of the maximum parallelotope on
which $\omega$ vanishes. We continue to assume that $\omega \ne 0$. Now suppose that
$A$ is a divisor such that $\omega$ vanishes on $\mathcal{L}(A)$. Suppose that $B$ is a divisor for
which $B \le A$ fails and for which $\omega(\mathcal{L}(B)) = 0$. We know that the divisor
$\max(A, B)$ has the property that $\mathcal{L}(\max(A, B)) = \mathcal{L}(A) + \mathcal{L}(B)$. Since $\omega$
vanishes on $\mathcal{L}(A)$ and $\mathcal{L}(B)$, it follows that it vanishes on $\mathcal{L}(\max(A, B))$. Since
$B \le A$ fails, there exists some $v_0 \in V_F$ with $\operatorname{ord}_{v_0} B > \operatorname{ord}_{v_0} A$, and this $v_0$ has
$\operatorname{ord}_{v_0} \max(A, B) > \operatorname{ord}_{v_0} A$. Thus $\deg \max(A, B) > \deg A$.

The second inequality proved above shows that the degree is bounded on all
divisors whose parallelotopes are in $\ker\omega$. In finitely many steps we consequently
arrive at a divisor $C$ with $\mathcal{L}(C) \subseteq \ker\omega$ such that any divisor $B$ with $\mathcal{L}(B) \subseteq \ker\omega$
has $B \le C$. Then $C$ is the unique maximum divisor on whose parallelotope $\omega$
vanishes. The parallelotope determines the divisor, and the proof of the corollary
is complete.


Recall from Section 2 that the additive subgroup $P_F$ of principal divisors within
the group $D_F$ of all divisors breaks $D_F$ into equivalence classes known as divisor
classes. The group $C_F = D_F/P_F$ is the group of all divisor classes. The operation
of a principal divisor $(y)$, for $y \in F^\times$, on a divisor $A$ is $A \mapsto A + (y)$. On the
other hand, we have seen that if a nonzero differential $\omega$ vanishes on $\mathcal{L}(A)$, then
$y\omega$ vanishes on $\mathcal{L}(A + (y))$. In the notation of the remarks with Corollary 9.14,
we therefore have

$$\operatorname{Div}(y\omega) = \operatorname{Div}(\omega) + (y).$$

A single orbit of nonzero differentials under the scalar-multiplication action on
$\operatorname{Diff}(F)$ by $F^\times$ thus yields a single divisor class within $D_F$. We shall show that
$\operatorname{Diff}(F)$ is 1-dimensional as an $F$ vector space. Then the nonzero differentials
form a single orbit under $F^\times$, and the divisors that arise as $\operatorname{Div}(\omega)$ for some
nonzero differential $\omega$ form a single divisor class.




Lemma 9.15. As a vector space over $F$, the space $\operatorname{Diff}(F)$ of differentials is
1-dimensional.


PROOF. First we prove that $\operatorname{Diff}(F)$ is nonzero. Referring to Theorem 9.13,
we know that $\delta(A) = \dim_k\big(\mathbb{A}_F/(\mathcal{L}(A) + \Delta(F))\big)$. If $\delta(A) > 0$, then there exist
nonzero linear functionals on $\mathbb{A}_F/(\mathcal{L}(A) + \Delta(F))$, and the lift of such a nonzero
linear functional to $\mathbb{A}_F$ is a nonzero differential. Thus it is enough to produce a
divisor $A$ with $\delta(A) > 0$. Fix $v_0$ in $V_F$, and let $A = -2v_0$. Proposition 9.10
shows that $\ell(A) = 0$. Therefore

$$\delta(A) = \ell(A) - \deg A - (1 - g) \ge 2 + g - 1 = g + 1 > 0,$$

and this $A$ has $\delta(A) > 0$.

Now we shall prove that the $F$ dimension of $\operatorname{Diff}(F)$ is at most 1. Arguing by
contradiction, suppose that $\omega$ and $\omega'$ are differentials that are linearly independent
over $F$. If $\omega$ vanishes on $\mathcal{L}(A)$ and $\omega'$ vanishes on $\mathcal{L}(A')$, then $\omega$ and $\omega'$ both vanish
on $\mathcal{L}(A) \cap \mathcal{L}(A') \supseteq \mathcal{L}(C)$, where $C = \min(A, A')$. Let $B$ be an arbitrary divisor.
Suppose for the moment that $L(B) \ne 0$. If $y \ne 0$ is in $L(B)$, then $(y) \ge -B$,
and $C + (y) \ge C - B$. So $\mathcal{L}(C + (y)) \supseteq \mathcal{L}(C - B)$. We have seen that the
vanishing of $\omega$ on $\mathcal{L}(C)$ implies the vanishing of $y\omega$ on $\mathcal{L}(C + (y))$. Therefore
$y\omega$ vanishes on $\mathcal{L}(C - B)$. Similarly $y\omega'$ vanishes on $\mathcal{L}(C - B)$.

Still with $L(B) \ne 0$, let $n = \ell(B)$, and let $x_1, \dots, x_n$ and $y_1, \dots, y_n$ be bases
of $L(B)$ over k. Then $x_1\omega, \dots, x_n\omega, y_1\omega', \dots, y_n\omega'$ are linearly independent
over k because a relation

$$\sum_{i=1}^n a_i x_i \omega + \sum_{j=1}^n b_j y_j \omega' = 0$$

would mean that the members $x = \sum_{i=1}^n a_i x_i$ and $y = \sum_{j=1}^n b_j y_j$ of F have
$x\omega + y\omega' = 0$. Since $\omega$ and $\omega'$ are assumed to be linearly independent over F,
$x = y = 0$. But then $a_i = 0$ for all i and $b_j = 0$ for all j. Consequently we
can generate 2n linearly independent differentials that all vanish on $\mathcal{L}(C - B)$.
These differentials may be regarded as linear functionals on the k vector space
$\mathbb{A}_F/(\mathcal{L}(C - B) + \Delta(F))$, whose k dimension is $\delta(C - B)$ by Theorem 9.13.
Consequently

$$\delta(C - B) \ge 2\ell(B),$$

and this inequality is true also if $L(B) = 0$, by Riemann's inequality. Substituting
from the formula for $\delta(\,\cdot\,)$, we obtain


$$\begin{aligned}
\ell(C - B) - \deg(C - B) - 1 + g &\ge 2\ell(B) \\
&= 2\big((\deg B + 1 - g) + \delta(B)\big) \\
&\ge 2\deg B + 2 - 2g
\end{aligned}$$

because Riemann's inequality shows that $\delta(B) \ge 0$. Replacing $\deg(C - B)$ by
$\deg C - \deg B$ gives

$$\deg B \le \ell(C - B) - \deg C - 3 + 3g. \tag{$*$}$$


Proposition 9.10 shows that $\ell(C - B) \le 1 + \deg(C - B)$ if $\ell(C - B) \ne 0$. In
this case the two inequalities together give

$$2\deg B \le -2 + 3g;$$

hence $\ell(C - B) = 0$ if $\deg B$ is positive and sufficiently large. Choosing then a
divisor $B$ with $\deg B$ positive and sufficiently large, we have $\ell(C - B) = 0$, and
($*$) gives

$$\deg B \le -\deg C - 3 + 3g.$$


Since the right side is fixed and the left side can be made arbitrarily large, we 
have arrived at a contradiction. 


As a result of Lemma 9.15, the divisors of the form $\operatorname{Div}(\omega)$ for some nonzero
differential $\omega$ constitute a single class in the group $C_F = D_F/P_F$ of divisor classes.
This class is called the canonical class of F, and any divisor in the class is called
a canonical divisor.


Theorem 9.16 (Riemann—Roch Theorem). Let F be a function field in one 
variable over a field k, and suppose that every member of F not in k is transcen- 
dental over k. If A is any divisor of F and C is any canonical divisor, then 


$$\ell(A) = \deg A + (1 - g) + \ell(C - A),$$


where g is the genus of F. 


PROOF. Lemma 9.15 shows that there exists a nonzero differential $\omega_0$. Let
$C_0 = \operatorname{Div}(\omega_0)$. Lemma 9.15 shows that $C = C_0 + (y_0)$ for some $y_0 \in F^\times$. Then
$\omega = y_0\omega_0$ has

$$\operatorname{Div}(\omega) = \operatorname{Div}(y_0\omega_0) = \operatorname{Div}(\omega_0) + (y_0) = C_0 + (y_0) = C.$$

Let $B$ be a divisor to be specified, and consider $C - B$. Any nonzero differential
$\omega'$ vanishing on $\mathcal{L}(C - B)$ is of the form $\omega' = z\omega$ for some $z \in F^\times$ by Lemma
9.15, and $\operatorname{Div}(\omega') = \operatorname{Div}(z\omega) = C + (z)$. Therefore $\mathcal{L}(C + (z)) \supseteq \mathcal{L}(C - B)$,
$C + (z) \ge C - B$, and $(z) \ge -B$. This inequality means that $z$ is in $L(B)$.
Conversely if $y$ is any nonzero element in $L(B)$, then $(y) \ge -B$ and $C + (y) \ge
C - B$. So $\mathcal{L}(C + (y)) \supseteq \mathcal{L}(C - B)$. We know that $y\omega$ vanishes on $\mathcal{L}(C + (y))$,
and hence $y\omega$ vanishes on $\mathcal{L}(C - B)$.

Consequently the differentials vanishing on $\mathcal{L}(C - B)$ are exactly the
differentials $y\omega$ with $y$ in $L(B)$. Such differentials vanish on $\Delta(F)$ by definition,
and the space of them is k isomorphic to the space of k linear functionals on
$\mathbb{A}_F/(\mathcal{L}(C - B) + \Delta(F))$. By Theorem 9.13 the latter space has k dimension
$\delta(C - B)$, and hence the space of differentials in question has k dimension
$\delta(C - B)$. In short,

$$\delta(C - B) = \ell(B).$$

Since $B$ is arbitrary, we can specialize it to $B = C - A$. Then we obtain

$$\ell(C - A) = \delta(A) = \ell(A) - \deg A - (1 - g),$$

and the theorem follows.


5. Applications of the Riemann—Roch Theorem 


We begin with some immediate applications of the Riemann—Roch Theorem, and 
then we obtain some applications that require arguments that are a bit more subtle. 
Another application appears in the problems at the end of Chapter X. 


Corollary 9.17. If $C$ is any canonical divisor, then $\ell(C) = g$.


PROOF. Put $A = 0$ in Theorem 9.16, and use the fact given in Corollary 9.4
that $\ell(0) = 1$.


Corollary 9.18. If C is any canonical divisor, then deg C = 2g — 2. 


PROOF. Put A = C in Theorem 9.16, and apply Corollary 9.17 and Corollary 
9.4. 
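Written out, the two substitutions into Theorem 9.16 behind Corollaries 9.17 and 9.18 read as follows. This is only a worked restatement of the two proofs; it uses nothing beyond Theorem 9.16 and the value $\ell(0) = 1$ from Corollary 9.4.

```latex
\begin{align*}
A = 0:&\quad \ell(0) = \deg 0 + (1-g) + \ell(C)
  \;\Longrightarrow\; 1 = 0 + (1-g) + \ell(C)
  \;\Longrightarrow\; \ell(C) = g,\\
A = C:&\quad \ell(C) = \deg C + (1-g) + \ell(0)
  \;\Longrightarrow\; g = \deg C + (1-g) + 1
  \;\Longrightarrow\; \deg C = 2g-2.
\end{align*}
```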


Corollary 9.19. Any divisor $A$ with $\deg A > 2g - 2$ has $\delta(A) = 0$, i.e.,
$\ell(A) = \deg A + (1 - g)$.

PROOF. If $\deg A > 2g - 2$, then it follows from Corollary 9.18 that
$\deg(C - A) < 0$. By Proposition 9.10, $\ell(C - A) = 0$. Then the corollary
is immediate from Theorem 9.16.
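The degree ranges in which $\ell(A)$ is forced by $\deg A$ alone can be summarized in a small sketch. The function name `ell_from_degree` is hypothetical, and the only inputs used are Proposition 9.10 (as quoted in the proof above, $\ell(A) = 0$ when $\deg A < 0$) and Corollary 9.19; in the middle range $0 \le \deg A \le 2g - 2$ the value depends on the divisor itself, so the sketch returns `None` there.

```python
from typing import Optional


def ell_from_degree(deg_a: int, genus: int) -> Optional[int]:
    """Return l(A) when deg A alone determines it, and None otherwise."""
    if deg_a < 0:
        # Proposition 9.10: L(A) = 0 when deg A < 0.
        return 0
    if deg_a > 2 * genus - 2:
        # Corollary 9.19: l(A) = deg A + (1 - g).
        return deg_a + 1 - genus
    # 0 <= deg A <= 2g - 2: the answer depends on the divisor class of A.
    return None


if __name__ == "__main__":
    for g in range(4):
        print(g, [ell_from_degree(d, g) for d in range(-1, 2 * g + 2)])
```

For instance, with g = 1 and deg A = 1 the sketch returns 1, which is the dimension count used in the proof of Corollary 9.22.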


Corollary 9.20. If $A$ is a divisor with $\deg A = 2g - 2$, then either $A$ is a
canonical divisor and $\ell(A) = g$, or $A$ is not a canonical divisor and $\ell(A) = g - 1$.


PROOF. If $A$ is a canonical divisor, then $\ell(A) = g$ by Corollary 9.17. Otherwise,
the divisor $C - A$, which has degree 0 by Corollary 9.18, is not a principal
divisor. Any nonzero $y$ in $L(C - A)$ then would have $(y) \ge -(C - A)$ and
$0 = \deg(y) \ge -\deg(C - A) = 0$; hence $v(y) = -\operatorname{ord}_v(C - A)$ for all $v$, and
$(y) = -(C - A)$, contradiction. Consequently $L(C - A) = 0$ and $\ell(C - A) = 0$.
Theorem 9.16 now gives $\ell(A) = \deg A + (1 - g) = (2g - 2) + (1 - g) = g - 1$.




EXAMPLES OF CANONICAL DIVISORS. 


(1) Genus g = 0. In Corollary 9.20 with g = 0, the alternative $\ell(A) =
g - 1 = -1$ is impossible, and therefore every divisor with degree $-2$ is a
canonical divisor.


(2) Genus g = 1. In Corollary 9.20 with g = 1, take $A = 0$. Then $\ell(A) =
1 = g$ by Corollary 9.4. So Corollary 9.20 says that the divisor 0 is a canonical
divisor.


Corollary 9.21. If $v_0$ is in $\mathbb{D}_F$ and $n > \max(2g - 1, 0)$, then there exists a
nonscalar $x$ in $F^\times$ with $(x)_\infty \le n v_0$.


PROOF. Let $A = n v_0$, and let $f_{v_0}$ be the residue class degree of $v_0$. Then
$\deg A = n f_{v_0} \ge n > \max(2g - 1, 0)$, and Corollary 9.19 gives

$$\ell(A) = \deg A + (1 - g) = n f_{v_0} + (1 - g) > \max(2g - 1, 0) + (1 - g) = \max(g, 1 - g) \ge 1.$$

Hence $\ell(A) \ge 2$, and $L(A)$ contains a nonscalar element $x$. This $x$ has

$$-n = -\operatorname{ord}_{v_0} A \le \operatorname{ord}_{v_0}(x) = \operatorname{ord}_{v_0}(x)_0 - \operatorname{ord}_{v_0}(x)_\infty = -\operatorname{ord}_{v_0}(x)_\infty,$$

and thus $(x)_\infty \le n v_0$.


Doubly periodic meromorphic functions on $\mathbb{C}$ in the subject of complex analysis
may be viewed as meromorphic functions on some torus, which is a compact
Riemann surface of genus 1. The Weierstrass $\wp$ function for the torus in question
has a double pole at one point, two zeros, and no other poles or zeros. It is therefore
a function $x$ with $(x)_\infty = 2v_0$ if $v_0$ is the discrete valuation corresponding to the
location of the pole. Hence this $x$ provides an example with equality holding in
Corollary 9.21 when g = 1. A theorem of Liouville in this terminology says that
there is no meromorphic function on the torus having just one simple pole and no
other poles. The final corollaries abstract this result to our setting, but they need
an additional hypothesis to ensure that $f_{v_0} = 1$. Certainly $f_{v_0}$ will equal 1 if k is
algebraically closed. We consider g = 1 and g > 1 separately. These corollaries
will be generalized in Problems 23-25 at the end of the chapter.


Corollary 9.22. If k is algebraically closed, if $v_0$ is in $\mathbb{D}_F$, and if g = 1, then
every $x$ in F with $(x)_\infty \le v_0$ is a scalar multiple of the identity.


PROOF. Put $A = v_0$. We seek $x \in F$ with $v_0(x) \ge -1 = -\operatorname{ord}_{v_0} A$ and
with $v(x) \ge 0 = -\operatorname{ord}_v A$ for all other $v$. Thus we seek $x$ in $L(A)$. This $A$
has $\deg A = 1 = g = 2g - 1$. By Corollary 9.19, $\ell(A) = \deg A + (1 - g) =
1 + (1 - 1) = 1$. Since $L(A)$ already contains the multiples of the identity, it
contains nothing else.


The particular torus is $\mathbb{C}/\Lambda$, where $\Lambda$ is the lattice of periods.




Corollary 9.23. If k is algebraically closed, if $v_0$ is in $\mathbb{D}_F$, and if g > 1, then
every $x$ in F with $(x)_\infty \le v_0$ is a scalar multiple of the identity.


PROOF. We argue by contradiction. Suppose that $x$ is a nonscalar element in
$L(v_0)$. Take $r = 2g - 1$, and let $c_1, \dots, c_r$ be distinct members of k. For each $j$
with $1 \le j \le r$, $x - c_j$ is in $L(v_0)$. Since $\deg(x - c_j) = 0$, there exists a unique
$v_j \in \mathbb{D}_F$ with $v_j(x - c_j) = 1$. The divisor of the element $(x - c_j)^{-1}$ is then $v_0 - v_j$.
It follows that every k linear combination of the elements $(x - c_j)^{-1}$ lies in $L(A)$
for $A = v_1 + \cdots + v_r$. On the other hand, these elements are linearly independent
because $v_j\big(\sum_{i=1}^r a_i (x - c_i)^{-1}\big) < 0$ if and only if $a_j \ne 0$. Thus $\ell(A) \ge 2g - 1$
and $\deg A = 2g - 1$. Since $\deg A > 2g - 2$, Corollary 9.19 is applicable and gives
$\ell(A) = \deg A + 1 - g$. Thus $2g - 1 \le \ell(A) = \deg A + 1 - g = 2g - 1 + 1 - g = g$,
and we obtain the contradiction $g \le 1$.


6. Problems 


1. Let F be a function field in one variable over the field k, and let $k'$ be the subfield
of all members of F that are algebraic over k.
(a) Suppose that $t_1, \dots, t_n$ are members of $k'$ that are linearly independent over
k, and suppose that $x \in F$ is transcendental over k. Prove that $t_1, \dots, t_n$ are
linearly independent over $k(x)$.
(b) Deduce from (a) that $[k' : k] \le [k'(x) : k(x)]$.
(c) Deduce that $[k' : k] < \infty$.


Problems 2-4 concern perfect fields, which were defined in Section VII.3. The field
k is perfect if either it has characteristic 0 or else it has characteristic p and the field
map $x \mapsto x^p$ of k into itself is onto.


2. Prove that an algebraic extension of a perfect field is perfect. 


3. When k is perfect, refine an argument in Section 1 by making use of Theorems
7.18, 7.20, 7.22, and the Theorem of the Primitive Element, and show that any
function field in one variable is the function field of some affine plane curve
irreducible over k.

4. Let k be a perfect field. An affine plane curve $f(X, Y)$ irreducible over k is
nonsingular at a point $(a, b)$ of its zero locus if at least one of $\frac{\partial f}{\partial X}(a, b)$ and
$\frac{\partial f}{\partial Y}(a, b)$ is nonzero. Using Bezout's Theorem and taking a cue from the proof of
Theorem 7.20, prove that the curve can be singular at only finitely many points
of its zero locus.


Problems 5-11 seek to attach a discrete valuation of the function field of an irreducible
affine plane curve to each point of the zero locus at which the curve is nonsingular.
Let k be a base field, let $f(X, Y)$ be an irreducible polynomial in $k[X, Y]$, let $R =
k[X, Y]/(f(X, Y))$, let $x$ and $y$ be the images of $X$ and $Y$ in $R$, and let F be the field
of fractions of $R$. Suppose that $(a, b) \in k^2$ has the property that $f(a, b) = 0$. The
condition of nonsingularity of $f$ at $(a, b)$ is that one of $\frac{\partial f}{\partial X}$ and $\frac{\partial f}{\partial Y}$ be nonvanishing at
$(a, b)$, and it will be assumed that $\frac{\partial f}{\partial X}(a, b) \ne 0$. Observe from Lemma 7.16 that if $S$ is
any integral domain, if $s$ is in $S$, and if $c(X)$ is in $S[X]$, then $c(X) - c(s) = (X - s)d(X)$
for some $d(X)$ in $S[X]$.


5. Let $f_1(X)$ be the member of $k[X]$ defined as above to make $f(X, b) =
(X - a) f_1(X)$. Using the fact that $\frac{\partial f}{\partial X}(a, b) \ne 0$, prove that $f_1(a) \ne 0$ and
therefore also that $f_1(x) \ne 0$.

6. Let $g(X, Y)$ be a member of $k[X, Y]$ with $g(x, y) \ne 0$. Prove that if $g(a, b) = 0$,
then there exist $g_1(X)$ in $k[X]$ and $h_1(X, Y)$ in $k[X, Y]$ with

$$g(X, Y) f_1(X) - f(X, Y) g_1(X) = (Y - b) h_1(X, Y),$$

and deduce that $g(x, y) = (y - b) h_1(x, y)/f_1(x)$.

7. Show that there is a discrete valuation $v_1$ of F over k with $v_1(y - b) > 0$.

8. If $h_1(a, b) = 0$ in Problem 6, then the process can be repeated to give

$$g(x, y) = (y - b)^2 h_2(x, y)/f_1(x)^2.$$

It can be repeated again if $h_2(a, b) = 0$, and so on. By applying the valuation
$v_1$ of the previous problem to $g(x, y)$, show that there is an upper bound to the
integers $k \ge 0$ such that a nonzero member $g(x, y)$ in $R$ can be written in the
form $g(x, y) = (y - b)^k h_k(x, y)/f_1(x)^k$ for some $h_k(x, y)$ in $R$.

9. (a) Deduce that each nonzero $g(x, y)$ in $R$ is of the form

$$g(x, y) = (y - b)^n h(x, y)/f_1(x)^n$$

with $n \ge 0$, $h(x, y)$ in $R$, and $h(a, b) \ne 0$, and that the integer $n$ and the
member $h(x, y)$ of $R$ are uniquely determined by $g(x, y)$.

(b) Conclude that every nonzero member $g(x, y)$ of the field of fractions F is
of the form $(y - b)^n h_1(x, y)/h_2(x, y)$ with $n$ in $\mathbb{Z}$, $h_1(x, y)$ and $h_2(x, y)$
nonzero in $R$, $h_1(a, b) \ne 0$, and $h_2(a, b) \ne 0$.

(c) Prove in (b) that $g(x, y)$ uniquely determines $n$.

10. Write each nonzero $g(x, y)$ in F as in (b) of the previous problem, and put
$v(g) = n$. Also, define $v(0) = \infty$. Show that the resulting function $v$ is a
well-defined valuation of F having $R$ in its valuation ring, taking the value 0 on
all members of $R$ that are nonvanishing at $(a, b)$, and having all members of $R$
vanishing at $(a, b)$ in its valuation ideal.




11. Prove that there is only one valuation of F over k taking the value 0 on all members 
of R that are nonvanishing at (a, b) and having all members of R vanishing at 
(a, b) in its valuation ideal. 


Problems 12-20 compute the genus of certain function fields in one variable. Let k
be a field of characteristic $\ne 2$, let $f(X)$ be a square-free nonconstant polynomial in
$k[X]$, let $F = k(X)[Y]/(Y^2 - f(X))$, and let $x$ and $y$ be the images of $X$ and $Y$ in F.
In these problems, p denotes a positive integer.
12. Verify that
(a) the element $x$ is transcendental over k, $y$ is algebraic over $k(x)$ with $y^2 =
f(x)$, and F is a function field in one variable over k,
(b) every member of F is uniquely of the form $a(x) + y b(x)$ with $a(x)$ and $b(x)$
in $k(x)$,
(c) every member of F not in k is transcendental over k,
(d) $F/k(x)$ is a Galois extension of degree 2, and the nontrivial element $\sigma$ of
$\operatorname{Gal}(F/k(x))$ satisfies $\sigma(a(x) + y b(x)) = a(x) - y b(x)$ for $a(x)$ and $b(x)$
in $k(x)$.


13. Prove that the integral closure of $k[x]$ in F is the ring $R$ of all elements
$a(x) + y b(x)$ such that $a(x)$ and $b(x)$ are in $k[x]$.


14. (a) Deduce from the previous problem that $R$ is the set of all members $z$ of F
such that $v(z) \ge 0$ for all $v$ in $\mathbb{D}_F$ that satisfy $v(x) \ge 0$.
(b) Deduce from (a) that $L(p(x)_\infty) \subseteq R$.


15. Let $v$ be any member of $\mathbb{D}_F$ with $v(x) < 0$.
(a) Prove that every nonzero $c(x)$ in $k[x]$ has $v(c(x)) = (\deg c)\,v(x)$.
(b) Prove that $v(y) = \tfrac{1}{2}(\deg f)\,v(x)$.
(c) Prove that if $a(x)$ and $b(x)$ are in $k[x]$ with $\deg b + \tfrac{1}{2}\deg f \le p$ and
$\deg a \le p$, then $v(a(x) + y b(x)) \ge p\,v(x)$.


16. Prove that if $a(x)$ and $b(x)$ are in $k[x]$ with $\deg b + \tfrac{1}{2}\deg f \le p$ and $\deg a \le p$,
then $a(x) + y b(x)$ lies in $L(p(x)_\infty)$.


17. (a) Prove that if $v$ is in $\mathbb{D}_F$ and if $\sigma$ is in $\operatorname{Gal}(F/k(x))$, then the function $v^\sigma$
defined by $v^\sigma(z) = v(\sigma(z))$ for $z \in F$ is in $\mathbb{D}_F$.
(b) Why is $v(x) < 0$ if and only if $v^\sigma(x) < 0$?
(c) Deduce that if $z$ is in $L(p(x)_\infty)$, then so is $\sigma(z)$.


18. (a) Using the previous problem, show that if $a(x)$ and $b(x)$ are in $k[x]$ with
$a(x) + y b(x)$ in $L(p(x)_\infty)$ and if $v$ is a member of $\mathbb{D}_F$ with $v(x) < 0$,
then $v(a(x)) \ge p\,v(x)$ and $v(a(x)^2 - f(x) b(x)^2) \ge 2p\,v(x)$. Conclude that
$\deg a \le p$ and $\deg(a^2 - f b^2) \le 2p$.

(b) Deduce that $L(p(x)_\infty)$ consists of all members $a(x) + y b(x)$ of $R$ such that
$\deg a \le p$ and $\deg b + \tfrac{1}{2}\deg f \le p$.



19. Calculate that $\ell(p(x)_\infty) = 2p + 2 - [\tfrac{1}{2}(1 + \deg f)]$ if $p \ge [\tfrac{1}{2}(1 + \deg f)]$.
Here $[\,\cdot\,]$ denotes the greatest integer function.

20. (a) Why is $\deg (x)_\infty = 2$?
(b) Using Corollary 9.19 with $A = p(x)_\infty$ for a suitable p, prove that the genus
of F is $g = [\tfrac{1}{2}(1 + \deg f)] - 1$.

Problems 21-22 compute the genus of certain further function fields in one variable.
The notation is as in Problems 12-20 except that $f(X)$ is allowed to have repeated
factors. Suppose that $f(X) = g(X)^2 h(X)$, where $h(X)$ is a square-free nonconstant
polynomial and $g(X)$ is in $k[X]$. Let $F = k(X)[Y]/(Y^2 - f(X))$.

21. With $F' = k(X)[Z]/(Z^2 - h(X))$, exhibit a field isomorphism $F \to F'$ fixing k.

22. Suppose that $f(X)$ has degree 3.
(a) Prove that F has genus 1 if $f(X)$ has no repeated root in k and that F has
genus 0 otherwise.
(b) Prove that the affine plane curve $Y^2 - f(X)$ over k has a singularity in $k_{\mathrm{alg}}^2$ if
and only if $f(X)$ has a repeated root in $k_{\mathrm{alg}}$. Here $k_{\mathrm{alg}}$ denotes an algebraic
closure of k.
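The numerical content of Problems 18(b), 19, and 20(b) can be cross-checked with a short sketch. The function names below are hypothetical; the first simply evaluates the genus formula stated in Problem 20(b), and the second counts the basis elements $x^i$ and $y x^j$ allowed by the degree conditions of Problem 18(b). The sketch verifies arithmetic consistency only and is not a substitute for the proofs the problems ask for.

```python
def genus_y2_equals_f(deg_f: int) -> int:
    """Genus formula of Problem 20(b): g = [(1 + deg f)/2] - 1, f square-free."""
    return (1 + deg_f) // 2 - 1


def ell_multiple_of_poles(p: int, deg_f: int) -> int:
    """Count pairs (a, b) of basis monomials with deg a <= p and
    deg b + (deg f)/2 <= p, as in the description of L(p (x)_oo) in 18(b)."""
    count_a = p + 1                       # monomials 1, x, ..., x^p
    max_deg_b = (2 * p - deg_f) // 2      # 2*deg b <= 2p - deg f
    count_b = max_deg_b + 1 if max_deg_b >= 0 else 0
    return count_a + count_b


if __name__ == "__main__":
    for d in range(1, 8):
        g = genus_y2_equals_f(d)
        p = (1 + d) // 2 + 1              # any p >= [(1 + d)/2] works
        # Corollary 9.19 with deg(p (x)_oo) = 2p predicts l = 2p + 1 - g.
        print(d, g, ell_multiple_of_poles(p, d), 2 * p + 1 - g)
```

With $\deg f = 3$ the formula gives genus 1, in agreement with the square-free case of Problem 22(a).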


Problems 23-25 introduce Weierstrass points. Let k be an algebraically closed field,
and let F be a function field in one variable over k of genus g. Fix a discrete valuation
$v$ in $\mathbb{D}_F$.

23. Why is it true that $\ell(0v) = 1$, $\ell(1v) = 1$ if $g \ge 1$, $\ell((2g - 1)v) = g$, $\ell(2gv) =
g + 1$, and $\ell(nv) \le \ell((n + 1)v) \le \ell(nv) + 1$ for all integers $n \ge 0$?

24. Deduce from the previous problem that there exist exactly g integers $0 < n_1 <
n_2 < \cdots < n_g < 2g$ such that there is no $x$ in F with $(x)_\infty = n_j v$. (Educational
note: The integers $n_j$ are called the Weierstrass gaps of $v$, and $(n_1, \dots, n_g)$ is
the gap sequence for $v$. Classically when F is viewed as the function field of
an everywhere nonsingular projective curve, then the points of the zero locus in
projective space are in one-one correspondence with the members of $\mathbb{D}_F$; with
this understanding, the point corresponding to $v$ is called a Weierstrass point if
the gap sequence for $v$ is anything but $(1, 2, \dots, g)$. Accordingly let us call $v$ a
Weierstrass valuation in this case.)

25. Prove that
(a) $v$ is a Weierstrass valuation if and only if $\ell(gv) > 1$.

(b) 1 is a Weierstrass gap if g > 0.

(c) $v$ is not a Weierstrass valuation if g = 0 or g = 1.

(d) if r and s are positive integers with sum < 2g that are not Weierstrass gaps
at $v$, then r + s is not a Weierstrass gap at $v$.

(e) if 2 is not a Weierstrass gap at $v$, then the gap sequence is $(1, 3, 5, \dots, 2g - 1)$.
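The additivity in Problem 25(d) says that the non-gaps below 2g are closed under sums, which makes gap sequences easy to experiment with numerically. The sketch below is an illustration only: the function name and the lists of generators are hypothetical inputs, and it assumes that every non-gap below 2g is a sum of the listed generators. It does not establish that any particular sequence arises from an actual function field.

```python
def gaps_from_nongap_generators(generators, genus):
    """Positive integers below 2*genus that are not sums of the generators."""
    bound = 2 * genus
    reachable = [False] * max(bound, 1)
    reachable[0] = True                   # the empty sum
    for n in range(1, bound):
        reachable[n] = any(n >= s and reachable[n - s] for s in generators)
    return [n for n in range(1, bound) if not reachable[n]]


if __name__ == "__main__":
    # If 2 is a non-gap and g = 3, the gaps are the odd numbers below 2g,
    # as Problem 25(e) predicts.
    print(gaps_from_nongap_generators([2, 7], 3))
    # Non-gaps 3, 4, 5 with g = 2 leave the ordinary sequence (1, 2).
    print(gaps_from_nongap_generators([3, 4, 5], 2))
```

In both sample runs the number of gaps equals the genus, consistent with the count in Problem 24.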


CHAPTER X 


Methods of Algebraic Geometry 


Abstract. This chapter investigates the objects and mappings of algebraic geometry from a geo- 
metric point of view, making use especially of the algebraic tools of Chapter VII and of Sections 
7-10 of Chapter VIII. In Sections 1-12, k denotes a fixed algebraically closed field. 

Sections 1-6 establish the definitions and elementary properties of varieties, maps between 
varieties, and dimension, all over k. Sections 1—3 concern varieties and dimension. Affine algebraic 
sets, affine varieties, and the Zariski topology on affine space are introduced in Section 1, and 
projective algebraic sets and projective varieties are introduced in Section 3. Section 2 defines 
the geometric dimension of an affine algebraic set, relating the notion to Krull dimension and 
transcendence degree. The actual context of Section 2 is a Noetherian topological space, the Zariski 
topology on affine space being an example. In such a space every closed subset is the finite union of 
irreducible closed subsets, and the union can be written in a certain way that makes the decomposition 
unique. Every nonempty closed set has a meaningful geometric dimension. In affine space the 
irreducible closed sets are the varieties, and each variety acquires a geometric dimension. The 
discussion in Section 2 applies in the context of projective space as well, and thus each projective 
variety acquires a geometric dimension. Moreover, any nonempty open subset of a Noetherian 
space is Noetherian. A nonempty open subset of an affine variety is called quasi-affine, and a 
nonempty open subset of a projective variety is called quasiprojective. Each quasi-affine variety or 
quasiprojective variety has a dimension equal to that of its closure, which is a variety. 

Sections 4-6 take up maps between varieties. Section 4 introduces spaces of scalar-valued 
functions on quasiprojective varieties — rational functions, functions regular at a point, and functions 
regular on an open set. The section goes on to relate these notions for the different kinds of varieties. 
Section 5 introduces morphisms, which are a restricted kind of function between varieties. The 
tools of Sections 4-5 together show that for many purposes all the different kinds of varieties can be 
treated as quasiprojective varieties. Section 6 introduces rational maps between varieties; these are 
not everywhere-defined functions, but each can be restricted to an open dense subset on which it is 
a morphism. Rational maps with dense image correspond to field mappings of the fields of rational 
functions, with the order of the mappings reversed. 

Section 7 concerns singularities at points of varieties, still over the field k. Zariski’s Theorem 
was stated in Chapter VII for affine varieties and partly proved at that time. In the current context 
it has a meaning for any point of any quasiprojective variety. The section proves the full theorem, 
which characterizes singular points in a way that shows they remain singular under isomorphisms 
of varieties. 

Section 8 concerns classification questions over k for irreducible curves, i.e., quasiprojective 
varieties of dimension 1. From Section 6 it is known that two irreducible curves are equivalent under 
rational maps if and only if their fields of rational functions are isomorphic. The main theorem of 
Section 8 is that each such equivalence class of irreducible curves contains an everywhere nonsingular 
projective curve, and this curve is unique up to isomorphism of varieties. The points of this curve 
are parametrized by those discrete valuations of the underlying function field that are defined over k. 




Sections 9-12 relate the general theory of Sections 1-6 to the topic of simultaneous
solutions of polynomial equations, as treated at length in Chapter VIII. Section 9 treats monomial
ideals in $k[X_1, \dots, X_n]$, identifying their zero loci concretely and computing their dimension. The
section goes on to introduce the affine Hilbert function of this ideal, which measures the proportion of
polynomials of degree $\le s$ not in the ideal. In the way that this function is defined, it is a polynomial
for large s called the affine Hilbert polynomial of the ideal. Its degree equals the dimension of the
zero locus of the ideal. Section 10 extends this theory from monomial ideals to all ideals, again 
concretely computing the dimension of the zero loci, obtaining an affine Hilbert polynomial, and 
showing that its degree equals the dimension of the zero locus of the ideal. Section 11 adapts the 
theory to homogeneous ideals and projective algebraic sets by making use of the cone in affine 
space over the set in projective space. Section 12 applies the theory of Section 11 to address the 
question how the dimension of a projective algebraic set is cut down when the set is intersected with 
a projective hypersurface. A consequence of the theory is the result that a homogeneous system of 
polynomial equations over an algebraically closed field with more unknowns than equations has a 
nonzero solution. 

Section 13 is a brief introduction to the theory of schemes, which extends the theory of varieties 
by replacing the underlying algebraically closed field by an arbitrary commutative ring with identity. 


1. Affine Algebraic Sets and Affine Varieties 


We come now to the more geometric side of algebraic geometry. At least initially 
this means that we are interested in the set of simultaneous solutions of a system 
of polynomial equations in several variables. Because of the Nullstellensatz the 
natural starting point for the investigation is the case that the underlying field of 
coefficients is algebraically closed. 

Accordingly, throughout Sections 1-6 of this chapter, k will denote an algebraically
closed field.¹ We fix a positive integer n and denote by A the polynomial
ring $A = k[X_1, \dots, X_n]$. Typical ideals of A will be denoted by $\mathfrak{a}, \mathfrak{b}, \dots$. We
begin by expanding on some definitions made in Section VIII.2. The set

$$\mathbb{A}^n = \{(x_1, \dots, x_n) \in k^n\}$$

is called affine n-space. Members of $\mathbb{A}^n$ are called points in affine n-space, and
the functions $P \mapsto x_j(P)$ give the coordinates of the points.

To each subset $S$ of polynomials in A, we associate the locus of common
zeros, or zero locus of the members of $S$:

$$V(S) = \{P \in \mathbb{A}^n \mid f(P) = 0 \text{ for all } f \in S\}.$$


Any such set $V(S)$ is called an affine algebraic set in $\mathbb{A}^n$. If $S$ is a finite set
$\{f_1, \dots, f_k\}$ of polynomials, we allow ourselves to abbreviate $V(\{f_1, \dots, f_k\})$
as $V(f_1, \dots, f_k)$. It is immediate from the definitions that $V(S)$ is the same as
$V(\mathfrak{a})$ if $\mathfrak{a}$ is the ideal in A generated by $S$. The Hilbert Basis Theorem shows that
every ideal of A is finitely generated, and it follows that every affine algebraic set
is of the form $V(f_1, \dots, f_k)$ for some k and some polynomials $f_1, \dots, f_k$.

¹The exposition in these sections is based in part on Chapters 2, 4, and 6 of Fulton's book,
Chapter I of Hartshorne's book, and Chapter I of Volume 1 of Shafarevich's books.

In Chapter VII we worked extensively with examples of ideals of A and their 
corresponding affine algebraic sets, and it will not be necessary to give further 
examples of that kind now. 

Observe from the definition that $V(S) = \bigcap_{f \in S} V(f)$ for any subset $S$ of A. It
follows immediately that $S \mapsto V(S)$, as a function carrying each subset $S$ of A
to a subset $V(S)$ of $\mathbb{A}^n$, is inclusion reversing: $S_1 \subseteq S_2$ implies $V(S_1) \supseteq V(S_2)$.
Using this same identity, we obtain the following further properties of $V$.


Proposition 10.1. Affine algebraic sets in $\mathbb{A}^n$ have the following properties:
(a) $V(\varnothing) = V(0) = \mathbb{A}^n$ and $V(A) = \varnothing$,
(b) $V\big(\bigcup_\alpha S_\alpha\big) = \bigcap_\alpha V(S_\alpha)$ if the $S_\alpha$'s are arbitrary subsets of A,
(c) $V(S) = V(S_1) \cup V(S_2)$ if $S_1$ and $S_2$ are subsets of A and if $S$ is defined
as the set of all products $f_1 f_2$ with $f_1 \in S_1$ and $f_2 \in S_2$.


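PROOF. Property (a) is immediate. For (b), we have

$$V\Big(\bigcup_\alpha S_\alpha\Big) = \bigcap_{f \in \bigcup_\alpha S_\alpha} V(f) = \bigcap_\alpha \bigcap_{f \in S_\alpha} V(f) = \bigcap_\alpha V(S_\alpha).$$

For (c), we observe first that $V(f_1 f_2) = V(f_1) \cup V(f_2)$ for any $f_1$ and $f_2$ in A.
Then

$$\begin{aligned}
V(S) &= \bigcap_{f_1 \in S_1,\, f_2 \in S_2} V(f_1 f_2) = \bigcap_{f_1 \in S_1} \bigcap_{f_2 \in S_2} \big(V(f_1) \cup V(f_2)\big) \\
&= \Big(\bigcap_{f_1 \in S_1} V(f_1)\Big) \cup \Big(\bigcap_{f_2 \in S_2} V(f_2)\Big) = V(S_1) \cup V(S_2).
\end{aligned}$$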
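Properties (b) and (c) of Proposition 10.1 can be checked by brute force on a small example. The sketch below is an illustration under a stated assumption: it works over the finite field of 5 elements, which is not algebraically closed and so is not the field k of this chapter. The two identities being tested use only the fact that a product in a field vanishes exactly when a factor does, so they hold over any field. The names `zero_locus`, `S1`, and `S2` are hypothetical.

```python
from itertools import product

P = 5  # assumption: the field Z/5 stands in for k in this finite check


def zero_locus(polys, n=2, p=P):
    """All points of (Z/p)^n at which every function in polys vanishes mod p."""
    return {pt for pt in product(range(p), repeat=n)
            if all(f(*pt) % p == 0 for f in polys)}


S1 = [lambda x, y: y - x * x, lambda x, y: x - 1]   # cuts out the point (1, 1)
S2 = [lambda x, y: x + y]                           # the line x + y = 0
# The set S of all products f1*f2 with f1 in S1 and f2 in S2.
S = [lambda x, y, f=f, g=g: f(x, y) * g(x, y) for f in S1 for g in S2]

if __name__ == "__main__":
    # Proposition 10.1(c): V(S) = V(S1) union V(S2).
    print(zero_locus(S) == zero_locus(S1) | zero_locus(S2))
    # Proposition 10.1(b) for two sets: V(S1 union S2) = V(S1) intersect V(S2).
    print(zero_locus(S1 + S2) == zero_locus(S1) & zero_locus(S2))
```

The empty list of polynomials gives the whole plane, matching $V(\varnothing) = \mathbb{A}^n$ in part (a).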


Properties (a), (b), and (c) in the proposition are the axioms for the closed
sets in a topology on $\mathbb{A}^n$. This topology is called the Zariski topology on affine
n-space. Every one-point set is closed. The Zariski topology on $\mathbb{A}^n$ is never
Hausdorff; for example, if n = 1, then it is the topology on $k^1 = k$ in which the
nonempty open sets are the complements of the finite sets. Since one-point sets
are closed and the topology is not Hausdorff, the Zariski topology on $\mathbb{A}^n$ is never
regular. At first glance it looks like a useless topology, but we shall see already
in Proposition 10.3b and again in Section 2 that it is quite helpful for handling
the bookkeeping used in passing back and forth between algebra and geometry.

Next we introduce a function $E \mapsto I(E)$, carrying each subset $E$ of $\mathbb{A}^n$ to an
ideal $I(E)$ in A, by the definition

$$I(E) = \{f \in A \mid f(P) = 0 \text{ for all } P \in E\}.$$




Then $I(E) = \bigcap_{P \in E} I(\{P\})$. It follows immediately that $E \mapsto I(E)$ is inclusion
reversing: $E_1 \subseteq E_2$ implies $I(E_1) \supseteq I(E_2)$. The result for $I(\,\cdot\,)$ that parallels
Proposition 10.1 is as follows.


Proposition 10.2. For fixed n, the function $I(\,\cdot\,)$ has the following properties:
(a) $I(\varnothing) = A$ and $I(\mathbb{A}^n) = 0$,
(b) $I(E_1 \cup E_2) = I(E_1) \cap I(E_2)$ if $E_1$ and $E_2$ are subsets of $\mathbb{A}^n$,
(c) $I(E_1 \cap E_2) \supseteq I(E_1) + I(E_2)$ if $E_1$ and $E_2$ are subsets of $\mathbb{A}^n$.


REMARKS. Equality can fail in (c). For example, if $E_1$ is the one-point set $\{0\}$
and $E_2$ is its complement, then $I(E_1 \cap E_2) = I(\varnothing) = A$, while $I(E_2) = 0$ and
$I(E_1)$ consists of all members of A with 0 constant term.

PROOF. Property (a) is immediate. For (b), we have

$$I(E_1 \cup E_2) = \bigcap_{P \in E_1 \cup E_2} I(\{P\}) = \Big(\bigcap_{P \in E_1} I(\{P\})\Big) \cap \Big(\bigcap_{P \in E_2} I(\{P\})\Big) = I(E_1) \cap I(E_2).$$

In (c), the fact that $I(\,\cdot\,)$ is inclusion reversing implies that $I(E_1 \cap E_2) \supseteq I(E_1)$
and that $I(E_1 \cap E_2) \supseteq I(E_2)$. Since $I(E_1 \cap E_2)$ is closed under addition, (c)
follows.


This is all quite elementary. The less trivial question is the extent to which
$V(\,\cdot\,)$ and $I(\,\cdot\,)$ are inverse to one another. Proposition 10.3 gives the answer.


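Proposition 10.3. For fixed n,
(a) $I(V(\mathfrak{a})) = \sqrt{\mathfrak{a}}$ for each ideal $\mathfrak{a}$ in A,
(b) $V(I(E)) = \overline{E}$ for each subset $E$ of $\mathbb{A}^n$, where $\overline{E}$ is the Zariski closure
of $E$,
(c) $V(\mathfrak{a}) = V(\sqrt{\mathfrak{a}})$ for each ideal $\mathfrak{a}$ in A,
(d) any two ideals $\mathfrak{a}$ and $\mathfrak{b}$ in A have $\mathfrak{a}\mathfrak{b} \subseteq \mathfrak{a} \cap \mathfrak{b} \subseteq \sqrt{\mathfrak{a}\mathfrak{b}}$ and consequently
have $V(\mathfrak{a} \cap \mathfrak{b}) = V(\mathfrak{a}\mathfrak{b}) = V(\mathfrak{a}) \cup V(\mathfrak{b})$.


REMARKS. Recall from Section VII.1 that $\sqrt{\mathfrak{a}}$ denotes the radical of $\mathfrak{a}$,
consisting of all $f$ in A such that $f^k$ is in $\mathfrak{a}$ for some integer $k \ge 1$. The radical of $\mathfrak{a}$
equals $\mathfrak{a}$ itself if $\mathfrak{a}$ is prime.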


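PROOF. Conclusion (a) is the Nullstellensatz as formulated in Theorem 7.1b.

For (b), the definitions show that $V(I(E)) \supseteq E$. Since any set $V(S)$ is Zariski
closed, we must have $V(I(E)) \supseteq \overline{E}$. On the other hand, the fact that $\overline{E}$ is closed
means that $\overline{E} = V(S)$ for some $S$. Thus $V(S) = \overline{E} \supseteq E$, and the
inclusion-reversing property of $I(\,\cdot\,)$ gives $I(V(S)) \subseteq I(E)$. Since the definitions imply
that $S \subseteq I(V(S))$, we obtain $S \subseteq I(E)$. From the inclusion-reversing property
of $V(\,\cdot\,)$, we conclude that $\overline{E} = V(S) \supseteq V(I(E))$.

For (c), (a) and (b) give $V(\sqrt{\mathfrak{a}}) = V(I(V(\mathfrak{a}))) = \overline{V(\mathfrak{a})} = V(\mathfrak{a})$ because $V(\mathfrak{a})$
is closed.

For (d), the inclusion $\mathfrak{a}\mathfrak{b} \subseteq \mathfrak{a} \cap \mathfrak{b}$ is immediate. If $f$ is in $\mathfrak{a} \cap \mathfrak{b}$, then $f$ is
in $\mathfrak{a}$ and in $\mathfrak{b}$, and hence $f^2$ is in $\mathfrak{a}\mathfrak{b}$. Thus $f$ is in $\sqrt{\mathfrak{a}\mathfrak{b}}$. Applying $V(\,\cdot\,)$ gives
$V(\mathfrak{a}\mathfrak{b}) \supseteq V(\mathfrak{a} \cap \mathfrak{b}) \supseteq V(\sqrt{\mathfrak{a}\mathfrak{b}})$. Since $V(\mathfrak{a}\mathfrak{b}) = V(\sqrt{\mathfrak{a}\mathfrak{b}})$ by (c), $V(\mathfrak{a} \cap \mathfrak{b}) =
V(\mathfrak{a}\mathfrak{b})$. Finally $V(\mathfrak{a}\mathfrak{b}) = V(\mathfrak{a}) \cup V(\mathfrak{b})$ by Proposition 10.1c.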


An affine variety is any affine algebraic set of the form $V(\mathfrak{p})$, where $\mathfrak{p}$ is a
prime ideal² of A. That is, an affine variety is the locus of common zeros of any
prime ideal of A.

For example, if $f$ is an irreducible polynomial in A, then $f$ is prime because A
is a unique factorization domain, and consequently the principal ideal $(f)$ is prime.
Thus the zero locus in $\mathbb{A}^2$ of an irreducible polynomial $f$ in $k[X, Y]$ is an example
of an affine variety. This particular kind of affine variety is called an irreducible
affine plane curve.³,⁴ More generally, if $f$ is irreducible in $A = k[X_1, \dots, X_n]$
with $n \ge 2$, then the zero locus of $f$ in $\mathbb{A}^n$ is called an irreducible affine
hypersurface.⁵ Another example of an affine variety is any translate of any vector
subspace of $\mathbb{A}^n$. Examples of affine varieties other than irreducible hypersurfaces,
translates of vector subspaces, and varieties built from other varieties in simple
ways often take some work to establish. The reason is that it is usually not easy
to show that a particular nonprincipal ideal is prime. Here is one example that is
manageable.


²Warning: The books by Fulton and Hartshorne in the Selected References use the narrow
definition of variety that is reproduced here. Some books by other authors allow all affine algebraic
sets to be called varieties. Volume 1 of Shafarevich's books does not use the word "variety."

³Warning: This definition represents a change from Chapters VIII and IX, corresponding to a
change in point of view. Previously the word "curve" referred to the ideal, and now it is to refer to
the zero locus. From a mathematical standpoint Proposition 10.3 shows that this distinction is not
important in the presence of the irreducibility and the fact that k is algebraically closed. The change
thus represents only a matter of convenience for the exposition.

⁴Some authors build the condition of irreducibility into the definition of "curve," but this book
does not.

⁵Some authors build the condition of irreducibility into the definition of "hypersurface," but this
book does not.


EXAMPLE. The twisted cubic in $\mathbb{A}^3$ is the zero locus $V(\mathfrak{p})$ of the ideal $\mathfrak{p}$ in
$k[X, Y, Z]$ given by $\mathfrak{p} = (Y - X^2, Z - X^3)$; that is, $V(\mathfrak{p}) = \{(x, x^2, x^3) \mid x \in k\}$.
The substitution homomorphism $\varphi$ that fixes k and sends $X$ to $X$, $Y$ to $X^2$, and
$Z$ to $X^3$ carries $k[X, Y, Z]$ into $k[X]$. It is onto $k[X]$ because any polynomial in
$X$ alone is sent to itself by $\varphi$. The kernel of $\varphi$ manifestly contains $\mathfrak{p}$. To see that
it equals $\mathfrak{p}$, we argue by contradiction. Choose a polynomial $f$ in $\ker \varphi$ not in $\mathfrak{p}$
whose degree in $Z$ is as small as possible and whose degree in $Y$ is as small as
possible among those of minimal degree in $Z$. If $Z$ occurs somewhere in $f$, then
by replacing all occurrences of $Z$ in $f$ with $X^3$, we replace $f$ by another member
of $f + \mathfrak{p}$ of lower degree in $Z$, contradiction. Thus $f$ has no $Z$ in it. Arguing
similarly, we see that $f$ has no $Y$ in it. So $f$ is a polynomial in $X$. Since $\varphi$ acts
as the identity on polynomials in $X$ alone, $f = 0$. This contradiction shows that
$\ker \varphi = \mathfrak{p}$. Since $\operatorname{image} \varphi = k[X]$ is an integral domain, $\mathfrak{p}$ is prime. By the
Nullstellensatz, $\mathfrak{p}$ may be described alternatively as the ideal of all polynomials
vanishing on $V(\mathfrak{p})$.
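The substitution homomorphism in the twisted cubic example is easy to experiment with. The sketch below is an illustration under stated assumptions: polynomials in X, Y, Z are stored as dictionaries from exponent triples to integer coefficients (integers standing in for elements of k), and the names `phi`, `gen1`, `gen2`, `f`, and `h` are hypothetical. It checks that the two generators of the ideal and the combination $YZ - X^5 = Z(Y - X^2) + X^2(Z - X^3)$ are sent to 0, while $Y - X$ is not.

```python
from collections import defaultdict


def phi(poly):
    """Apply X -> X, Y -> X^2, Z -> X^3 to {(a, b, c): coeff}, where the key
    (a, b, c) stands for X^a Y^b Z^c; return {degree in X: coeff}."""
    image = defaultdict(int)
    for (a, b, c), coeff in poly.items():
        image[a + 2 * b + 3 * c] += coeff
    return {d: v for d, v in image.items() if v != 0}


gen1 = {(0, 1, 0): 1, (2, 0, 0): -1}   # Y - X^2
gen2 = {(0, 0, 1): 1, (3, 0, 0): -1}   # Z - X^3
f = {(0, 1, 1): 1, (5, 0, 0): -1}      # YZ - X^5, a member of the ideal
h = {(0, 1, 0): 1, (1, 0, 0): -1}      # Y - X, not in the kernel

if __name__ == "__main__":
    print(phi(gen1), phi(gen2), phi(f))   # all three images are zero
    print(phi(h))                         # the image X^2 - X is nonzero
```

An empty dictionary represents the zero polynomial, so membership in the kernel is the test `phi(poly) == {}`.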


Every affine variety is nonempty, as a consequence of the Nullstellensatz. In
fact, any prime ideal $\mathfrak{p}$ of A is contained in a maximal ideal $\mathfrak{m}$, whose zero locus
is identified as some point $P$ of $\mathbb{A}^n$. The inclusion $\mathfrak{p} \subseteq \mathfrak{m}$ implies that $V(\mathfrak{p}) \supseteq
V(\mathfrak{m}) = \{P\}$. Affine varieties are characterized by a geometric irreducibility
property that is stated in Corollary 10.4.


Corollary 10.4. The affine varieties in $\mathbb{A}^n$ are characterized as those nonempty
Zariski closed sets that cannot be written as the union of two proper closed subsets.


REMARKS. One says that the affine varieties are those affine algebraic sets that 
are irreducible. Irreducible sets are nonempty by definition. 


PROOF. Let V(p) be an affine variety with p prime, and suppose that V(p) =
E_1 ∪ E_2 with E_1 and E_2 both closed and properly contained in V(p). Application
of I(·) and use of Proposition 10.2b gives I(V(p)) = I(E_1) ∩ I(E_2). Proposition
10.3a allows us to rewrite this conclusion as p = b_1 ∩ b_2 with b_1 = I(E_1) and
b_2 = I(E_2). By Problem 10a at the end of Chapter VII, p = b_1 or p = b_2. If
p = b_1, then V(p) = V(b_1) = V(I(E_1)), and this equals E_1 by Proposition
10.3b because E_1 is closed. Similarly if p = b_2, then V(p) = E_2. Thus E_1 and
E_2 cannot both be proper subsets of V(p).

Conversely suppose that E is an irreducible closed subset of A^n. Let f and
g be members of A with fg in I(E). Then Propositions 10.3b and 10.1c give
E = V(I(E)) ⊆ V(fg) = V(f) ∪ V(g). Therefore

E = (E ∩ V(f)) ∪ (E ∩ V(g))

exhibits E as the union of two closed sets. By irreducibility one of the two closed
sets equals E. If E = E ∩ V(f), then E ⊆ V(f) and I(E) ⊇ I(V(f)) ⊇ (f).
If E = E ∩ V(g), then similarly I(E) ⊇ (g). Either way, one of f and g lies in
I(E). Since E is assumed nonempty, I(E) is proper. Therefore I(E) is prime.


2. Geometric Dimension 


We continue to assume that k is an algebraically closed field and to write A 
for k[X1,..., Xn]. If p is a prime ideal in A, then the dimension of the affine 
variety V(p) was defined in Section VII.2 to be the transcendence degree of the 
field of fractions of the integral domain A/p over k. This quantity depends only 




on V(p) because p can be recovered from V(p) by the formula p = I(V(p))
given in Proposition 10.3a. The integral domain A/p is finitely generated as a
k algebra with generators X_1 + p, ..., X_n + p, and Theorem 7.22 shows that
this transcendence degree equals the Krull dimension of the ring A/p, which is
denoted by dim A/p. The latter quantity is the supremum of the indices d of all
strictly increasing chains p_0 ⊊ p_1 ⊊ ··· ⊊ p_d of prime ideals in A/p.

Because of this equality, it is natural to use the notion of Krull dimension in 
order to generalize the definition of dimension from varieties to all nonempty 
affine algebraic sets.⁶ If a is any proper ideal in A, not necessarily prime, and
V(a) is its locus of common zeros, we might first try defining dim V(a) to be the
Krull dimension of A/a. This approach is a bit cumbersome because two distinct 
ideals a and a’ can have V(a) = V(a’); thus some argument would be needed to 
see that dim V (a) is well defined before it would be possible to proceed. 

Instead, we shall give a direct geometric definition of dimension in terms of 
the Zariski topology on A^n. Theorem 10.7 later in this section will show that the
geometric quantity dim V(a) equals the Krull dimension of A/√a, thus that the
dimension of an affine algebraic set has an algebraic formulation. From this result
we shall deduce that dim V(a) equals the Krull dimension of A/a itself. This
algebraic formulation of a definition will not yet allow us to compute dimensions
concretely, but we shall introduce in Sections 9-11 an equivalent combinatorial
definition of dimension that is computable in terms of Gröbner bases.

A topological space X will be said to be Noetherian if every strictly decreasing
sequence of closed subsets is finite in length. An example is affine n-space A^n.
In fact, if E_1, E_2, ... are closed sets in A^n with E_1 ⊇ E_2 ⊇ ···, then the
corresponding ideals have I(E_1) ⊆ I(E_2) ⊆ ···. Since A is Noetherian, there
exists some integer k with I(E_k) = I(E_{k+1}) = ···. Applying V(·) and using
Proposition 10.3b, we obtain E_k = E_{k+1} = ···.

We can generalize the definition of irreducibility for closed sets from A^n to
an arbitrary Noetherian topological space. Namely a nonempty closed set E is
irreducible if it is not the union of two proper closed subsets. An important
observation about any Noetherian topological space is that any nonempty relatively
open subset U of an irreducible closed set V is dense in V; in fact, if Ū denotes
the closure of U, then V = Ū ∪ (V − U) exhibits V as the union of two closed
subsets, and the irreducibility forces Ū = V since V − U ≠ V.


Proposition 10.5. If X is a Noetherian topological space, then any closed 
subset is the finite union of irreducible closed subsets. This decomposition of a 
closed set as such a union may be chosen in such a way that none of the closed sets 
in the union contains another set in the union, and in this case the decomposition 
is unique. 


⁶ We shall leave the dimension of the empty set as undefined for now.




PROOF. For existence of some decomposition of each closed set as a finite
union of irreducible closed subsets, we argue by contradiction. Assuming that
there exists some closed subset E of X that is not the finite union of irreducible
closed subsets, we may assume by the Noetherian condition on X that E is minimal
among all such counterexamples. Since E cannot itself be irreducible, we can
write E = E_1 ∪ E_2 with E_1 and E_2 closed and properly contained in E. Since
E is minimal among all closed subsets that are not the finite union of irreducible
closed subsets, E_1 and E_2 can be expressed as finite unions of irreducible closed
subsets. Substituting these expressions into the equality E = E_1 ∪ E_2 gives a
contradiction to the fact that E is a counterexample.

This proves existence of a decomposition. By going through the sets in the 
decomposition one at a time and by discarding any set that is contained in another 
set, we obtain a decomposition as in the second sentence of the proposition. 

For uniqueness, suppose that E = E_1 ∪ ··· ∪ E_k = F_1 ∪ ··· ∪ F_l gives two
decompositions of the asserted kind. Say that k ≥ l. Since F_i ⊆ E_1 ∪ ··· ∪ E_k,
we obtain F_i = (F_i ∩ E_1) ∪ ··· ∪ (F_i ∩ E_k). Irreducibility of F_i implies that
F_i = F_i ∩ E_{j(i)} for some j = j(i). Hence F_i ⊆ E_{j(i)} for some function j(i)
from {1, ..., l} to {1, ..., k}. Reversing the roles of the E_j's and the F_i's yields
a function i(j) such that E_j ⊆ F_{i(j)}. Then F_i ⊆ E_{j(i)} ⊆ F_{i(j(i))}. Since no
F_{i'} with i' ≠ i contains F_i, we conclude that i(j(i)) = i for all i. Therefore
k = l, and i(·) and j(·) are inverse to each other.
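
The pruning step in the second paragraph of the proof (discard any set that is contained in another) can be sketched on finite sets as follows. This is only an illustration: the function name `prune` is hypothetical, the inputs are plain finite sets, and irreducibility of the pieces is assumed rather than checked.

```python
def prune(components):
    """Keep one copy of each set and drop any set properly contained in another."""
    distinct = list(dict.fromkeys(frozenset(c) for c in components))
    return [c for c in distinct if not any(c < other for other in distinct)]

pieces = [{1, 2}, {2}, {3}, {1, 2}]
kept = prune(pieces)
print(kept)
# The union is unchanged by pruning.
print(set().union(*kept) == set().union(*pieces))
```

The surviving sets have the same union as the original list, and none of them contains another, which is the form of decomposition for which the proposition asserts uniqueness.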


Corollary 10.6. Every affine algebraic set in A^n can be expressed uniquely as
the finite (possibly empty) union of affine varieties in such a way that none of the 
varieties contains another of the varieties. 


REMARKS. For example, 
V(X^2 − Y^2) = V(X + Y) ∪ V(X − Y)


by Proposition 10.1c, and the affine algebraic set on the left side is expressed as 
the union of the affine varieties on the right. 
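
The set identity in this example can be checked by brute force over a small finite field. The sketch below uses F_7, which is an assumption made only so that the points can be enumerated; F_7 is not algebraically closed, so this tests the identity of zero loci and nothing about irreducibility.

```python
p = 7  # illustrative prime; the text works over an algebraically closed field
points = [(x, y) for x in range(p) for y in range(p)]

def zero_locus(f):
    """Points of F_p x F_p at which the polynomial function f vanishes."""
    return {pt for pt in points if f(*pt) % p == 0}

lhs = zero_locus(lambda x, y: x * x - y * y)
rhs = zero_locus(lambda x, y: x + y) | zero_locus(lambda x, y: x - y)
print(lhs == rhs, len(lhs))
```

The two lines X + Y = 0 and X − Y = 0 each contribute p points and meet only at the origin when p is odd.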


PROOF. We saw before Proposition 10.5 that A^n is a Noetherian topological
space, and Corollary 10.4 shows that the irreducible subsets are the affine varieties. 
The closed sets are the affine algebraic sets by definition, and hence the result is 
a special case of Proposition 10.5. 


The geometric dimension of a nonempty closed subset E of a Noetherian
topological space X is the supremum of the integers d ≥ 0 such that there exists
a strictly increasing chain E_0 ⊊ E_1 ⊊ ··· ⊊ E_d of irreducible closed subsets
of E. This definition makes sense because a chain with d = 0 can always be
formed with E_0 equal to one of the irreducible closed sets from Proposition 10.5;




however, there is no guarantee in this generality that the geometric dimension
will be finite. In any event, it is clear from the definition that if two closed sets
E and E′ have E ⊆ E′, then the geometric dimension of E is ≤ the geometric
dimension of E′.

In the case of a nonempty affine algebraic set V(S), the geometric dimension 
of V(S) is to refer to this kind of dimension relative to the Zariski topology. 
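
For a finite topological space the definition can be carried out mechanically. The following sketch assumes that the space is given by the complete list of its closed sets; the function names and the three sample spaces are hypothetical illustrations, and finite spaces are used only because their closed sets can be enumerated.

```python
def irreducible_closed(closed):
    """Nonempty closed sets that are not a union of two proper closed subsets."""
    closed = [frozenset(c) for c in closed]
    result = []
    for e in closed:
        if not e:
            continue
        proper = [c for c in closed if c < e]
        if not any(a | b == e for a in proper for b in proper):
            result.append(e)
    return result

def geometric_dimension(closed):
    """Largest d with a strict chain E_0 < E_1 < ... < E_d of irreducible closed sets."""
    irr = sorted(irreducible_closed(closed), key=len)
    best = {}  # best[e] = length of the longest chain ending at e
    for e in irr:
        best[e] = max((best[s] + 1 for s in best if s < e), default=0)
    return max(best.values())

sierpinski = [set(), {"s"}, {"s", "eta"}]       # one closed point, one generic point
discrete = [set(), {"a"}, {"b"}, {"a", "b"}]    # two-point discrete space
chain3 = [set(), {0}, {0, 1}, {0, 1, 2}]        # nested closed sets
print(geometric_dimension(sierpinski),
      geometric_dimension(discrete),
      geometric_dimension(chain3))
```

Sorting by cardinality guarantees that every proper subset of a set has already been processed when the set itself is reached, so the chain lengths are computed in one pass.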


EXAMPLES OF GEOMETRIC DIMENSION IN A^n.

(1) Any one-point set in A^n is closed and plainly has geometric dimension 0.
Any affine variety V with more than one point has geometric dimension ≥ 1,
since {P} ⊊ V is a strictly increasing chain of irreducible closed sets if P is
chosen as a point in V.

(2) A^n has geometric dimension n. This fact will follow from Theorem 10.7
below because A has Krull dimension n as a consequence of Theorem 7.22.

(3) Twisted cubic in A^3, namely {(x, x^2, x^3) | x ∈ k}. According to the
example in Section 1, this is V(p) for the prime ideal p = (Y − X^2, Z − X^3) ⊆
k[X, Y, Z]. The inclusions of prime ideals (X, Y, Z) ⊋ (Y − X^2, Z − X^3) ⊋
(Y − X^2) ⊋ 0 give the strictly increasing chain {0} ⊊ V(p) ⊊ {(x, x^2, z)} ⊊ A^3,
which is of the kind described for A^3. If another term could be included between
{0} and V(p), then we would obtain a sequence showing that A^3 has geometric
dimension ≥ 4, in contradiction to Example 2. So V(p) has geometric dimension
≤ 1. In view of Example 1, V(p) has geometric dimension equal to 1.
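
The inclusions in the chain of Example 3 can be verified pointwise over a small finite field. The choice of F_5 below is an assumption made for enumeration only; over a finite field the Zariski topology is discrete, so this sketch confirms the strict inclusions of point sets and does not compute any dimension.

```python
p = 5  # illustrative prime
A3 = {(x, y, z) for x in range(p) for y in range(p) for z in range(p)}

origin = {(0, 0, 0)}
cubic = {(x, x**2 % p, x**3 % p) for x in range(p)}               # parametrized curve
surface = {(x, y, z) for (x, y, z) in A3 if (y - x * x) % p == 0}  # V(Y - X^2)

# The parametrized curve agrees with the common zero locus of Y - X^2 and Z - X^3.
zero_locus = {(x, y, z) for (x, y, z) in A3
              if (y - x * x) % p == 0 and (z - x**3) % p == 0}

print(cubic == zero_locus, origin < cubic < surface < A3)
```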


Theorem 10.7. If a is any proper ideal of A, then the following four quantities
are equal:
(a) the geometric dimension of V(a),
(b) the Krull dimension of A/√a,
(c) the maximum of the geometric dimension of V_i over all affine varieties
V_i contained in V(a),
(d) the Krull dimension of A/a.


REMARKS. We take these equal quantities as the definition of the dimension 
dim V(a) of the affine algebraic set V(a). Because of Theorem 7.22, these 
quantities equal the transcendence degree over k of the field of fractions of A/a 
in the case that a is a prime ideal. For a = 0, we know that dim A = n; hence
the equal quantities in the theorem are ≤ n.


PROOF. Let
E_0 ⊆ E_1 ⊆ ··· ⊆ E_d (*)


be an increasing chain of irreducible closed subsets of V(a), and define p_j to be
the ideal p_j = I(E_j). Then each p_j is a prime ideal by Corollary 10.4, and also


p_d ⊆ ··· ⊆ p_1 ⊆ p_0 (**)




because I(·) is inclusion reversing. If (*) is strictly increasing, then so is (**); in
fact, if p_j were to equal p_{j−1} for some j, then we would have E_j = V(I(E_j)) =
V(p_j) = V(p_{j−1}) = V(I(E_{j−1})) = E_{j−1}, contradiction. In (*), we have E_d ⊆
V(a), and thus Proposition 10.3a gives √a = I(V(a)) ⊆ I(E_d) = p_d. In other
words, any strictly increasing sequence (*) of irreducible closed subsets of V(a)
yields a strictly increasing sequence (**) of prime ideals of A that contain √a.

Conversely if (**) is a strictly increasing sequence of prime ideals of A
containing √a, and if we define E_j = V(p_j) for 0 ≤ j ≤ d, then we obtain the
sequence (*) of irreducible closed subsets of V(√a) = V(a), and (*) is strictly
increasing, since an equality E_j = E_{j−1} would imply that p_j = I(V(p_j)) =
I(E_j) = I(E_{j−1}) = I(V(p_{j−1})) = p_{j−1} because of Proposition 10.3a.

Thus the strictly increasing sequences (*) of irreducible closed subsets of V(a)
are in one-one correspondence with the strictly increasing sequences (**) of prime
ideals of A containing √a. Let φ : A → A/√a be the quotient homomorphism.
Application of φ to (**) yields a strictly increasing sequence of ideals of A/√a
by the First Isomorphism Theorem, and prime ideals map to prime ideals under
this correspondence. Thus the existence of a strictly increasing sequence as in
(**) implies that the Krull dimension of A/√a is ≥ d. Meanwhile, the existence
of a strictly increasing sequence as in (*) implies that the geometric dimension of
V(a) is ≥ d. We have seen that these sequences are in one-one correspondence,
and therefore the equality of (a) and (b) in the theorem follows.

In (c) certainly the geometric dimension of any V_i is ≤ the geometric dimension
of V(a). If d_0 denotes the geometric dimension of V(a), then we can find a strictly
increasing chain as in (*) with d = d_0 and with all the sets contained in V(a).
Corollary 10.4 shows that E_{d_0} is an affine variety contained in V(a), and the
sequence (*) shows that the geometric dimension of E_{d_0} is at least d_0. Thus
V_i = E_{d_0} is an affine variety contained in V(a) whose geometric dimension
equals that of V(a).

To complete the proof, we show the equality of (b) and (d), i.e., we show that
A/a and A/√a have the same Krull dimension. Since a ⊆ √a, it is enough to
show that in any strictly increasing sequence of prime ideals as in (**) such that
all the ideals contain a, all the ideals actually contain √a. (Then the sequences
(**) for a will be in one-one correspondence with the sequences for √a, and we
can argue using the First Isomorphism Theorem as in the third paragraph of the
proof.) Thus let x be in √a. By definition of radical, x^k lies in a for some k.
Since a ⊆ p_d, x^k lies in p_d. But p_d is prime, and therefore x lies in p_d. Thus
every ideal in the sequence (**) for a occurs in the sequence (**) for √a, and
the theorem follows.


The dimension of an irreducible hypersurface in A = k[X_1, ..., X_n] is n − 1,
as was observed in Section VII.5. Proposition 10.9 below will prove a converse.




Lemma 10.8. Every minimal nonzero prime ideal in A is principal. 


PROOF. Let p be a minimal nonzero prime ideal, let f ≠ 0 be a
member of p, and write f as the product of irreducible elements. Since p is prime,
one of the irreducible elements, say g, lies in p. Since A is a unique factorization
domain, g is prime. Consequently (g) is a prime ideal of A lying in p. By
minimality of p, p = (g).


Proposition 10.9. Suppose that p is a prime ideal of A and V(p) is the 
corresponding affine variety. If dim V(p) = n — 1, then p is principal, and hence 
V (p) is an irreducible hypersurface. 

PROOF. For any n ≥ 1, dim V(p) = n − 1 < n = dim V(0) implies p ≠ 0.
Since dim V(p) = n − 1, there exists a chain


0 = q_0 ⊊ q_1 ⊊ ··· ⊊ q_{n−1}


of prime ideals in A/p. If φ : A → A/p denotes the quotient homomorphism,
then this chain lifts to A as


0 ⊊ p ⊊ φ^{-1}(q_1) ⊊ ··· ⊊ φ^{-1}(q_{n−1}).


This chain has n members after the 0 at the left, and A has Krull dimension n.
Consequently the first nonzero element, which is p, is a minimal nonzero prime
ideal of A. By Lemma 10.8, p is principal.


A quasi-affine variety is any nonempty Zariski open subset of an affine variety. 
These sets and their projective analogs, which will be defined in Section 3, will be 
the main objects of interest geometrically in Sections 1-6. If Y is a quasi-affine 
variety, then the closure Ȳ is the affine variety in question because any nonempty
relatively open subset of an affine variety is dense in the variety.⁷

Let us see that the relative Zariski topology on a quasi-affine variety Y makes 
Y into a Noetherian topological space. In fact, if X is a Noetherian topological 
space and Y is a topological subspace, then Y is Noetherian. To see this, we 
argue by contradiction, letting E_1 ⊋ E_2 ⊋ ··· be a strictly decreasing sequence
of relatively closed sets in Y. Then the sequence of closures in X forms a
decreasing sequence of closed sets in X with the property that E_j = Y ∩ Ē_j for
each j because E_j is assumed to be relatively closed in Y. It follows that the
sequence of closures is strictly decreasing, contradiction.

Consequently any quasi-affine variety Y is Noetherian in the relative Zariski 
topology and has a meaningful geometric dimension. We write dim Y for this 
dimension. 


⁷ This important observation was made just before Proposition 10.5.




Lemma 10.10. If Y is a quasi-affine variety in A^n and if E is a nonempty
relatively closed subset of Y, then E is irreducible⁸ for Y if and only if Ē is
irreducible for A^n.


REMARKS. We shall actually prove the stronger result that if Y is a nonempty
open subset of a Noetherian topological space X (such as A^n) and if E is a
nonempty relatively closed subset of Y, then E is irreducible for Y if and only if
Ē is irreducible for X. This stronger result will be used in Section 3.


PROOF. First we check that E reducible implies Ē reducible. If E is reducible,
say is a union E = E_1 ∪ E_2 with E_1 and E_2 relatively closed proper subsets of
E, then Ē = Ē_1 ∪ Ē_2. Each of Ē_1 and Ē_2 is a closed subset of Ē. To see that
Ē_1 is proper, we argue by contradiction. If Ē_1 = Ē, then intersecting both sides
with Y gives the contradiction E_1 = Y ∩ Ē_1 = Y ∩ Ē = E because E_1 and E
are both relatively closed. Similarly Ē_2 is proper, and thus Ē is reducible.

Conversely suppose that Ē is reducible, say is a union Ē = F_1 ∪ F_2 with F_1
and F_2 closed in X and properly contained in Ē. Intersecting both sides with
Y gives E = Y ∩ Ē = Y ∩ (F_1 ∪ F_2) = (Y ∩ F_1) ∪ (Y ∩ F_2) because E is
relatively closed. The sets Y ∩ F_1 and Y ∩ F_2 are relatively closed, and their
union is E. To see that E is reducible, we argue by contradiction. If Y ∩ F_1 = E,
then E ⊆ F_1. Since F_1 is closed in X, Ē ⊆ F_1. Thus F_1 is not a proper subset
of Ē, contradiction. Similarly we cannot have Y ∩ F_2 = E, and therefore E
is exhibited as the union of the two proper relatively closed subsets Y ∩ F_1 and
Y ∩ F_2.


Proposition 10.11. If Y is a quasi-affine variety in A^n, then dim Y = dim Ȳ.
Here dim Ȳ refers to the dimension of the affine variety Ȳ in any of the senses of
Theorem 10.7.


REMARKS. This proposition is a formal consequence of Lemma 10.10. The 
stronger statement that we actually prove is that if Y is a nonempty open subset 
of a Noetherian topological space X, then the geometric dimension of Y as a 
Noetherian space equals the geometric dimension of X as a Noetherian space. 


PROOF. Let E_0 ⊊ E_1 ⊊ ··· ⊊ E_d be a strictly increasing sequence of relatively
closed irreducible subsets of Y. Then Ē_0 ⊆ Ē_1 ⊆ ··· ⊆ Ē_d is an increasing
sequence of closed subsets of A^n, each of which is irreducible by Lemma 10.10.
Since E_j = Y ∩ Ē_j for each j, the sets Ē_j are strictly increasing. Since the given
sequence of sets E_j is arbitrary, it follows that dim Y ≤ dim Ȳ.

For the reverse inequality, let F_0 ⊊ F_1 ⊊ ··· ⊊ F_d be a strictly increasing
sequence of irreducible closed subsets of Ȳ. If E_j denotes F_j ∩ Y, then E_0 ⊆


⁸ ... in the sense of not being the union of two relatively closed proper subsets.




E_1 ⊆ ··· ⊆ E_d is an increasing sequence of relatively closed subsets of Y,
each of which is irreducible by Lemma 10.10. Since F_j = Ē_j, the sets E_j are
strictly increasing. Since the given sequence of sets F_j is arbitrary, it follows that
dim Ȳ ≤ dim Y.


3. Projective Algebraic Sets and Projective Varieties 


We continue to assume that k is an algebraically closed field and to write A 
for k[X1,..., Xn]. In Section VIII.3 we studied the projective analogs of affine 
plane curves, and the task for the present section is to study similarly the projective 
analogs of general affine algebraic sets, affine varieties, and quasi-affine varieties. 
As in Section VIII.3, projective n-space over k is defined set theoretically as
the quotient
P^n = {(x_0, ..., x_n) ∈ k^{n+1} − {0}} / ∼,


where (x_0', ..., x_n') ∼ (x_0, ..., x_n) if (x_0', ..., x_n') = λ(x_0, ..., x_n) for some
λ ∈ k^×. We write [x_0, ..., x_n] for the class of (x_0, ..., x_n) in P^n.
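
To make the quotient construction concrete, the sketch below lists one representative per class of P^n over a prime field F_p, scaling each nonzero vector so that its first nonzero coordinate is 1. The restriction to F_p (which is not algebraically closed) and the function name `projective_points` are assumptions for illustration; the relation ∼ is the one defined above.

```python
from itertools import product

def projective_points(n, p):
    """One normalized representative for each class in P^n over F_p, p prime."""
    reps = set()
    for v in product(range(p), repeat=n + 1):
        if any(v):                                  # skip the zero vector
            lead = next(c for c in v if c)          # first nonzero coordinate
            inv = pow(lead, -1, p)                  # its inverse modulo p
            reps.add(tuple(c * inv % p for c in v))
    return reps

print(len(projective_points(2, 3)), (3**3 - 1) // (3 - 1))
```

Each class contains p − 1 nonzero vectors, so the count of classes is (p^{n+1} − 1)/(p − 1).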
Put Ã = k[X_0, ..., X_n]. The polynomials of interest for algebraic geometry
relative to P^n are the homogeneous polynomials in Ã. The definitions of “monomial,”
“total degree” of a monomial, “homogeneous polynomial,” and “degree” of
a homogeneous polynomial all appear in Section VIII.3; monomials are defined so
as to have coefficient 1. By convention the 0 polynomial is homogeneous of every
degree. We write Ã_d = k[X_0, ..., X_n]_d for the k vector space of homogeneous
polynomials of degree d. Each member F of Ã_d satisfies


F(λx_0, ..., λx_n) = λ^d F(x_0, ..., x_n)


for all (x_0, ..., x_n) ∈ k^{n+1} and λ ∈ k^×. Conversely the fact that the mapping of
polynomials into polynomial functions is one-one for an infinite field implies that
a member F of Ã is homogeneous of degree d if it satisfies the above displayed
property. Four further properties of Ã_d from Section VIII.3 are that


• the zero locus of a member of Ã_d is well defined as a subset of P^n,

• the monomials of total degree d form a k basis of the vector space Ã_d,
• dim_k Ã_d = C(d + n, n), the binomial coefficient,
• any polynomial factor of a homogeneous polynomial over a field k is
homogeneous.
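
The dimension count in the third property can be confirmed by enumerating monomials directly. In this sketch a monomial in X_0, ..., X_n is represented by its exponent vector; the helper name `monomials` is hypothetical.

```python
from itertools import product
from math import comb

def monomials(n, d):
    """Exponent vectors (e_0, ..., e_n) with e_0 + ... + e_n = d."""
    return [e for e in product(range(d + 1), repeat=n + 1) if sum(e) == d]

for n in range(4):
    for d in range(5):
        assert len(monomials(n, d)) == comb(d + n, n)

print(len(monomials(2, 3)), comb(5, 2))
```

For instance, with n = 2 and d = 3 the count is the number of cubic monomials in three variables.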


An ideal a in Ã is called a homogeneous ideal if it is the vector-space
sum over d ≥ 0 of its intersections with Ã_d: a = ⊕_{d=0}^∞ (a ∩ Ã_d). Any ideal
in Ã that is generated by homogeneous polynomials is a homogeneous ideal. A
special case of this fact is that if a k vector subspace a_d of Ã_d is specified for each



integer d ≥ 0, then a = ⊕_{d=0}^∞ a_d is a homogeneous ideal if and only if for each
d ≥ 0 and e ≥ 0, the inclusion F a_d ⊆ a_{d+e} holds for each F in Ã_e.
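
The direct-sum condition says that a homogeneous ideal contains every homogeneous component of each of its members. The following sketch only shows how a polynomial splits into such components; the dictionary encoding (exponent vector to coefficient) and the name `homogeneous_components` are illustrative assumptions.

```python
def homogeneous_components(f):
    """Split {exponents: coeff} into {total_degree: {exponents: coeff}}."""
    parts = {}
    for exps, c in f.items():
        parts.setdefault(sum(exps), {})[exps] = c
    return parts

# f = X0^2 + X0*X1 + X1^3 in k[X0, X1]
f = {(2, 0): 1, (1, 1): 1, (0, 3): 1}
print(homogeneous_components(f))
```

Here the degree 2 component is X0^2 + X0*X1 and the degree 3 component is X1^3, and their sum recovers f.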

We can now imitate some of the development of Sections 1 and 2 for the
present context as long as we stick to homogeneous polynomials in Ã and to
homogeneous ideals. For any homogeneous polynomial F in Ã, the set


V(F) = {[x_0, ..., x_n] ∈ P^n | F(x_0, ..., x_n) = 0}


is well defined by the first bulleted property above. Thus if S is any set of
homogeneous elements in Ã, we can associate the locus of common zeros in P^n,
or zero locus, of the members of S by the formula


V(S) = ∩_{F ∈ S} V(F).


If a is a homogeneous ideal, then V(a) by convention means V(S), where S is
the subset of all homogeneous members of a. Any such set V(S) is called a
projective algebraic set in P^n. The function S ↦ V(S) is inclusion reversing.
The analog of Proposition 10.1 in the present context is that projective algebraic
sets have the following properties:
(i) V(∅) = V(0) = P^n and V(Ã) = ∅,
(ii) V(∪_α S_α) = ∩_α V(S_α) if the S_α's are arbitrary sets of homogeneous
elements in Ã,
(iii) V(S) = V(S_1) ∪ V(S_2) if S_1 and S_2 are sets of homogeneous elements
in Ã and if S is defined as the set of all products F_1F_2 with F_1 ∈ S_1 and
F_2 ∈ S_2.
Consequently the projective algebraic sets in P^n form the closed sets for a topology
on P^n called the Zariski topology on P^n.
Next we associate to each point P of P^n a homogeneous ideal I(P) in Ã by
the definition


I(P) = {F ∈ Ã | F(x_0, ..., x_n) = 0 whenever [x_0, ..., x_n] = P}.


Problem 1 at the end of the chapter shows that I(P) is indeed a homogeneous
ideal. In terms of the ideals I(P), we define I(E) = ∩_{P ∈ E} I(P) for each
subset E of P^n. The result E ↦ I(E) is a function carrying subsets E of P^n to
homogeneous ideals I(E) in Ã. The function E ↦ I(E) is inclusion reversing,
and the same argument as for Proposition 10.2 shows that for each n it satisfies
(i) I(∅) = Ã and I(P^n) = 0,
(ii) I(E_1 ∪ E_2) = I(E_1) ∩ I(E_2) if E_1 and E_2 are subsets of P^n,
(iii) I(E_1 ∩ E_2) ⊇ I(E_1) + I(E_2) if E_1 and E_2 are subsets of P^n.




If S is any set of homogeneous elements in Ã and if V = V(S) is the
corresponding projective algebraic set in P^n, then we define the cone over V
to be the subset of A^{n+1} given by


C(V) = {(0, ..., 0)} ∪ {(x_0, ..., x_n) ∈ A^{n+1} | [x_0, ..., x_n] ∈ V}.


This kind of set has the following two properties:
(i) V nonempty implies that the ideals I(C(V)) and I(V) in Ã are equal,
(ii) any homogeneous ideal a in Ã with V(a) nonempty in P^n has C(V(a))
equal to the subset V(a) in affine (n + 1)-space.
Use of this device reduces a number of questions about P^n to questions about
A^{n+1}. An example is a projective analog of Proposition 10.3, which appears as
the next proposition.


Proposition 10.12. For fixed n,

(a) (homogeneous Nullstellensatz) a homogeneous ideal a in Ã has V(a)
empty in P^n if and only if there is an integer N such that a contains Ã_k
for k ≥ N,

(b) I(V(a)) = √a for each homogeneous ideal a in Ã for which V(a) is
nonempty in P^n,

(c) V(I(E)) = Ē for each subset E of P^n, where Ē is the Zariski closure of
E in P^n.


REMARK. For clarity in the proof, let us write V_a(·) and V_p(·) to distinguish
zero loci in A^{n+1} from zero loci in P^n.


PROOF. For (a), V_p(a) is empty in P^n if and only if V_a(a) is contained in
{0} in A^{n+1}, if and only if √a = I(V_a(a)) contains (X_0, ..., X_n) by the affine
Nullstellensatz. In this case if f_1, ..., f_r are generators of √a, then the elements
f_1^m, ..., f_r^m are in a for some m, and it follows that (Σ_{i=1}^r c_i f_i)^k lies in a for
all scalars c_i whenever k ≥ rm, hence Ã_k ⊆ a for k ≥ rm. Conversely if √a
fails to contain some X_j, then X_j^k is not in a for any k ≥ 1, and Ã_k cannot be
contained in a.

For (b), I_p(V_p(a)) = I_a(C(V_p(a))) = I_a(V_a(a)) = √a by (i) of cones, (ii) of
cones, and the affine Nullstellensatz.

Conclusion (c) is proved by the same argument as for Proposition 10.3b. 
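
The degree bound in the proof of (a) can be observed in a simple monomial case. The sketch below takes the hypothetical ideal a = (X_0^m, ..., X_n^m), whose radical is (X_0, ..., X_n) and whose zero locus in P^n is therefore empty; a monomial lies in a exactly when some exponent is at least m. The function name and the choice of ideal are assumptions for illustration.

```python
from itertools import product

def degree_k_inside(n, m, k):
    """True if every monomial of degree k in X_0..X_n is divisible by some X_i^m."""
    return all(max(e) >= m
               for e in product(range(k + 1), repeat=n + 1) if sum(e) == k)

n, m = 2, 2
r = n + 1                                   # number of generators X_0, ..., X_n
print([k for k in range(1, 8) if degree_k_inside(n, m, k)])
```

By the pigeonhole principle every monomial of degree at least r(m − 1) + 1 has some exponent at least m, so the bound k ≥ rm used in the proof is sufficient though not sharp in this example.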


A projective variety is any nonempty⁹ projective algebraic set of the form
V(p), where p is a prime homogeneous ideal in Ã. If the ideal p is the principal


⁹ The prime homogeneous ideal p = (X_0, ..., X_n) has V(p) = ∅, but no other prime
homogeneous ideal q has V(q) = ∅. In order to avoid trivial counterexamples to some results, we shall
often want to exclude this particular prime ideal p from consideration.




ideal generated by an irreducible homogeneous polynomial, then the ideal or the
variety is called an irreducible projective hypersurface.¹⁰


Corollary 10.13. The projective varieties in P^n are characterized as those
nonempty Zariski closed sets that cannot be written as the union of two proper
closed subsets.


REMARK. Such a subset of P^n is said to be irreducible. As in the affine case,
irreducible sets are understood to be nonempty.


PROOF. If V(p) is a projective variety, then the union of {0} and the subset
of k^{n+1} whose equivalence classes are in V(p) is an affine variety in A^{n+1}. It is
irreducible in A^{n+1}, and this irreducibility in A^{n+1} implies irreducibility within P^n.

Conversely if E is an irreducible closed subset of P^n and if F and G are
homogeneous members of Ã with FG in I(E), then we can argue as in the proof
of Corollary 10.4 to see that one of F and G lies in I(E) and that I(E) is proper.
Since I(E) is a homogeneous ideal, this fact implies that I(E) is prime.


Since Ã is a Noetherian ring, it follows that P^n is a Noetherian topological
space in the sense of Section 2. Consequently Proposition 10.5 is applicable. 
Combining this result with Corollary 10.13, we obtain the following corollary. 


Corollary 10.14. Every projective algebraic set in P^n can be expressed
uniquely as the finite (possibly empty) union of projective varieties in such a
way that none of the varieties contains another of the varieties.


Geometric dimension is therefore meaningful for nonempty projective algebraic
sets, and each such set in P^n has geometric dimension ≤ n.

A quasiprojective variety is any nonempty Zariski open subset of a projective 
variety. Quasi-affine varieties and quasiprojective varieties will be the main 
objects of interest geometrically in Sections 1-7. If Y is a quasiprojective variety, 
then the relative Zariski topology on Y makes Y into a Noetherian topological 
space, just as in the quasi-affine case. Consequently Y has a meaningful geometric 
dimension. The arguments in Lemma 10.10 and Proposition 10.11 concerning 
quasi-affine varieties are arguments in point-set topology and valid proofs of facts 
about quasiprojective varieties. Therefore we obtain the following result. 


Proposition 10.15. If Y is a quasiprojective variety in P^n, then the closure Ȳ
in the Zariski topology of P^n is a projective variety, and the geometric dimensions
of Y and Ȳ are equal.


¹⁰ As in the affine case, as long as the assumption of irreducibility is in force, the distinction
between the ideal and the variety is unimportant. 




We can identify A^n as a subset of P^n by the formula


β_0(x_1, ..., x_n) = [1, x_1, ..., x_n]


for (x_1, ..., x_n) in A^n. The complement of β_0(A^n) in P^n is the zero locus
of the homogeneous polynomial X_0, and consequently β_0(A^n) is open in P^n.
Since the equality P^n = V(0) exhibits P^n as a projective variety, β_0(A^n) is a
quasiprojective variety. We are going to show that β_0 respects topologies in that
the Zariski topology of A^n is carried to the Zariski topology of the quasiprojective
variety β_0(A^n). To do so, we make use of the corresponding transpose mapping
β_0^t : Ã → A on polynomials given by β_0^t F = f with


f(X_1, ..., X_n) = F(β_0(X_1, ..., X_n)) = F(1, X_1, ..., X_n).


This is the substitution homomorphism that fixes k, fixes X_1, ..., X_n, and carries
X_0 to 1. Being an algebra homomorphism onto, β_0^t carries ideals of Ã to ideals
of A. In particular, it carries homogeneous ideals of Ã to ideals of A.


Lemma 10.16. If a is a homogeneous ideal in Ã and b = β_0^t(a) is its image
under β_0^t, then β_0^t carries the set of homogeneous elements of a onto b.


PROOF. Every member of b is the sum of the images under β_0^t of finitely many
homogeneous members of a. If F_1, ..., F_k are these homogeneous members,
then it is enough to produce G_1, ..., G_k in a all homogeneous of the same degree
such that β_0^t(F_j) = β_0^t(G_j) for all j. If d_1, ..., d_k are the respective degrees of
F_1, ..., F_k, and if d = max(d_1, ..., d_k), then the elements G_j = X_0^{d − d_j} F_j have
the required properties.
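
The padding device in this proof, multiplying each F_j by a power of X_0 to reach a common degree, can be sketched as follows. Polynomials are encoded as dictionaries from exponent vectors (e_0, ..., e_n) to integer coefficients; this encoding and all function names are assumptions made for illustration.

```python
def degree(F):
    """Total degree of a nonzero homogeneous polynomial {exponents: coeff}."""
    return max(sum(e) for e in F)

def times_x0_power(F, r):
    """Multiply F by X_0^r."""
    return {(e[0] + r,) + e[1:]: c for e, c in F.items()}

def dehomogenize(F):
    """The substitution X_0 -> 1, dropping the X_0 exponent."""
    f = {}
    for e, c in F.items():
        f[e[1:]] = f.get(e[1:], 0) + c
    return {e: c for e, c in f.items() if c != 0}

def pad_to_common_degree(Fs):
    d = max(degree(F) for F in Fs)
    return [times_x0_power(F, d - degree(F)) for F in Fs]

F1 = {(0, 1): 1}                 # X_1, degree 1
F2 = {(2, 0): 1, (0, 2): 1}      # X_0^2 + X_1^2, degree 2
G1, G2 = pad_to_common_degree([F1, F2])
print(G1, G2)
```

Because the substitution X_0 → 1 ignores powers of X_0, each padded G_j has the same image as the corresponding F_j.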


Lemma 10.17. Let a be a homogeneous ideal of Ã, and let b be the ideal of A
given by b = β_0^t(a). Then β_0(V(b)) = V(a) ∩ β_0(A^n).


PROOF. If (x_1, ..., x_n) is in V(b) and if F is a homogeneous member of a,
then f = β_0^t(F) is in b with 0 = f(x_1, ..., x_n) = F(β_0(x_1, ..., x_n)). Since F
is arbitrary, β_0(x_1, ..., x_n) is in V(a). Thus β_0(V(b)) ⊆ V(a) ∩ β_0(A^n).

For the reverse inclusion, let [1, x_1, ..., x_n] be in V(a) ∩ β_0(A^n). If f is
in b, find by Lemma 10.16 a homogeneous F in a with β_0^t F = f. Since
[1, x_1, ..., x_n] is in V(a), F(1, x_1, ..., x_n) = 0. Therefore f(x_1, ..., x_n) =
F(β_0(x_1, ..., x_n)) = F(1, x_1, ..., x_n) = 0. Since f is arbitrary in b, the point
(x_1, ..., x_n) is in V(b), and β_0(V(b)) ⊇ V(a) ∩ β_0(A^n).


Proposition 10.18. Under the inclusion β_0 : A^n → P^n, the Zariski topology
of affine n-space A^n coincides with the relative topology from P^n.




PROOF. If we start from an affine algebraic set V(b) in A^n, then Lemma 10.17
shows that β_0(V(b)) = V(a) ∩ β_0(A^n) for the homogeneous ideal a = (β_0^t)^{-1}(b)
in Ã. Since V(a) is Zariski closed in P^n, β_0(V(b)) is exhibited as closed in the
relative topology on β_0(A^n).

Conversely suppose that C is closed in the relative topology on β_0(A^n). Then
it is of the form C̄ ∩ β_0(A^n) for some projective algebraic set C̄. The set C̄ is of
the form V(a) for some homogeneous ideal a. If b = β_0^t(a), then Lemma 10.17
shows that


β_0(V(b)) = V(a) ∩ β_0(A^n) = C̄ ∩ β_0(A^n) = C,


and C is exhibited as β_0 of an affine algebraic set in A^n.


Corollary 10.19. If V is a quasi-affine variety in A^n, then β_0(V) is a
quasiprojective variety in P^n. Moreover, the geometric dimension of V as a quasi-affine
variety equals the geometric dimension of β_0(V) as a quasiprojective variety.


REMARKS. In other words, the closure of β_0(V) in P^n is a projective variety. It is
called the projective closure of the quasi-affine variety V. If V is actually an affine
variety, then it has an associated prime ideal in A, and the projective closure of
β_0(V) has an associated homogeneous prime ideal in Ã. The correspondence between
the prime ideal in A and the homogeneous prime ideal in Ã will be examined
shortly.


PROOF. Because of the homeomorphism given by Proposition 10.18, Lemma
10.10 as restated in the lemma’s remarks applies with Y = β_0(A^n), X = P^n, and
E equal to the closure of V in A^n. The conclusion is that the closure of E in P^n
is a projective variety, and the first conclusion of the corollary is proved. The
second conclusion is immediate from the version of Proposition 10.11 mentioned 
in the remarks with that proposition. 


To each index $i$ with $0 \le i \le n$, we can associate in a similar way a function
$\beta_i : \mathbb{A}^n \to \mathbb{P}^n$. The formula for $\beta_i$ is $\beta_i(x_1,\dots,x_n) = [y_0,\dots,y_n]$, where
$y_j = x_{j+1}$ for $j < i$, $y_i = 1$, and $y_j = x_j$ for $j > i$. Just as in Proposition 10.18,
under each $\beta_i$, the Zariski topology of affine $n$-space $\mathbb{A}^n$ coincides with the relative
topology from $\mathbb{P}^n$. One consequence is that the notion of projective closure is
meaningful if formed relative to any $\beta_i$ in place of $\beta_0$. Another consequence is that
$\mathbb{P}^n$ has a covering by $n + 1$ open sets $\beta_i(\mathbb{A}^n)$ that are each Zariski homeomorphic
to $\mathbb{A}^n$. The functions $\beta_i$ may be viewed as playing a role similar to the inverses
of charts in the definition of a smooth manifold.
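
The formula for the chart maps can be made concrete with a short computation. The following Python sketch is illustrative only: the names `beta`, `beta_inverse`, and `charts_containing` are hypothetical and do not come from the text, and points are modeled as lists of rational numbers rather than as elements of an algebraically closed field.

```python
from fractions import Fraction

def beta(i, x):
    """Hypothetical model of beta_i: insert the coordinate 1 in position i.

    With x = [x_1, ..., x_n], the result [y_0, ..., y_n] has y_j = x_{j+1}
    for j < i, y_i = 1, and y_j = x_j for j > i.
    """
    x = [Fraction(v) for v in x]
    return x[:i] + [Fraction(1)] + x[i:]

def beta_inverse(i, y):
    """Inverse of beta_i on the open set where y_i != 0.

    Rescale the homogeneous coordinates so that y_i = 1, then drop y_i.
    """
    if y[i] == 0:
        raise ValueError("point is not in the image of beta_i")
    scaled = [Fraction(v) / Fraction(y[i]) for v in y]
    return scaled[:i] + scaled[i + 1:]

def charts_containing(y):
    """Indices i such that [y_0, ..., y_n] lies in beta_i(A^n)."""
    return [i for i, v in enumerate(y) if v != 0]
```

Every point of projective space has at least one nonzero homogeneous coordinate, so `charts_containing` never returns an empty list for a valid point; this is the covering by $n + 1$ open sets described above.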


Having used $\beta_0$ to associate a projective variety in $\mathbb{P}^n$ to each affine variety in
$\mathbb{A}^n$ by passage to the topological closure, we turn to what happens with ideals.
Distinct homogeneous ideals in $\widetilde{A}$ can map under $\beta_0^t$ to the same ideal in $A$; for
example the principal ideals $(1)$ and $(X_0)$ in $\widetilde{A}$ both map to $(1)$ in $A$. Theorem
10.20 will show that we can associate a particularly nice ideal of $\widetilde{A}$ to each ideal
of $A$ in such a way that prime ideals of $A$ correspond to those nice ideals of $\widetilde{A}$
that are prime. Under this correspondence the ideals for an affine variety and its
projective closure will match. It will be apparent from the construction in the
proof that the ideal of $\widetilde{A}$ is generated by all homogeneous polynomials $F = F(f)$
of the form

$$F(X_0,\dots,X_n) = X_0^d\, f(X_1/X_0,\dots,X_n/X_0)$$

whenever $f \ne 0$ is in the ideal of $A$ and $\deg f = d$.


Theorem 10.20. As a mapping of ideals in $\widetilde{A}$ to ideals in $A$, $\beta_0^t$ is one-one
from the set $\widetilde{\mathcal{I}}$ of all homogeneous ideals $\mathfrak{a}$ of $\widetilde{A}$ such that $X_0F \in \mathfrak{a}$ implies $F \in \mathfrak{a}$
onto the set $\mathcal{I}$ of all ideals of $A$. Under this one-one correspondence prime ideals
correspond to prime ideals.


PROOF. We are going to construct a two-sided inverse to the mapping induced
by $\beta_0^t$ from ideals in $\widetilde{\mathcal{I}}$ to ideals in $\mathcal{I}$.

Let $A_{\le d}$ be the $k$ vector space of all members of $A$, including the 0 polynomial,
of degree $\le d$. The homomorphism $\beta_0^t$ carries $\widetilde{A}_d$ linearly into $A_{\le d}$, and it carries
the basis of homogeneous monomials in $\widetilde{A}$ of total degree $d$ onto the basis of all
monomials in $A$ of total degree $\le d$. Thus $\beta_0^t : \widetilde{A}_d \to A_{\le d}$ is one-one onto.
Observe for any $f$ in $A_{\le d}$ that the formula

$$F(X_0,\dots,X_n) = X_0^d\, f(X_1/X_0,\dots,X_n/X_0)$$

defines a member of $\widetilde{A}_d$. If we write $F = \varphi_d(f)$ when $f$ and $F$ are related in this
way, then the function $\varphi_d$ is a one-one $k$ linear map from $A_{\le d}$ into $\widetilde{A}_d$ such that
$\beta_0^t \varphi_d$ is the identity on $A_{\le d}$. Because of finite dimensionality, $\beta_0^t : \widetilde{A}_d \to A_{\le d}$ and
$\varphi_d : A_{\le d} \to \widetilde{A}_d$ are two-sided inverses of one another.
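
The two maps in this paragraph are easy to model by machine. The sketch below is an illustration under stated assumptions, not part of the proof: the names `homogenize` and `dehomogenize` are hypothetical, a polynomial is stored as a dictionary from exponent tuples to nonzero integer coefficients, and the new indeterminate $X_0$ occupies the first slot of each exponent tuple.

```python
def homogenize(f, d):
    """Model of phi_d: send f of degree <= d to X_0^d f(X_1/X_0, ..., X_n/X_0).

    Each monomial of degree e is multiplied by X_0^(d - e), so the result
    is homogeneous of degree d.
    """
    F = {}
    for exps, c in f.items():
        deg = sum(exps)
        if deg > d:
            raise ValueError("f has degree larger than d")
        F[(d - deg,) + exps] = c
    return F

def dehomogenize(F):
    """Model of beta_0^t: substitute X_0 = 1, so F goes to F(1, X_1, ..., X_n)."""
    f = {}
    for exps, c in F.items():
        key = exps[1:]
        f[key] = f.get(key, 0) + c
        if f[key] == 0:
            del f[key]
    return f

# f = Y - X^2 in k[X, Y, Z], with exponents ordered (X, Y, Z).
f = {(0, 1, 0): 1, (2, 0, 0): -1}
# F = W*Y - X^2, with exponents ordered (W, X, Y, Z) and W playing the role of X_0.
F = homogenize(f, 2)
```

Here `dehomogenize(homogenize(f, d))` returns `f` for every `d` at least the degree of `f`, which reflects the statement that $\beta_0^t \varphi_d$ is the identity on $A_{\le d}$. Applying `dehomogenize` to the monomials $X_0$ and $1$ gives the same constant polynomial, matching the earlier remark that $(1)$ and $(X_0)$ have the same image.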

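Suppose that an ideal $\mathfrak{b}$ in $A$ is given. Define $\mathfrak{a}_d = \varphi_d(\mathfrak{b} \cap A_{\le d})$, and put
$\mathfrak{a} = \bigoplus_{d=0}^{\infty} \mathfrak{a}_d$. According to remarks in the paragraph with the definition of
homogeneous ideal, $\mathfrak{a}$ is a homogeneous ideal if $G\mathfrak{a}_d \subseteq \mathfrak{a}_{d+e}$ whenever $G$ is in
$\widetilde{A}_e$. Define $g = \beta_0^t(G)$. This polynomial has $\deg g \le e$ and $\varphi_e(g) = G$, since
$\varphi_e : A_{\le e} \to \widetilde{A}_e$ is a two-sided inverse of $\beta_0^t : \widetilde{A}_e \to A_{\le e}$. If $f$ is in $\mathfrak{b} \cap A_{\le d}$, then
$gf$ is in $\mathfrak{b} \cap A_{\le d+e}$, and thus $G\varphi_d(f) = \varphi_e(g)\varphi_d(f) = \varphi_{d+e}(gf)$ is in $\mathfrak{a}_{d+e}$.
This proves that $\mathfrak{a}$ is a homogeneous ideal in $\widetilde{A}$.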

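Under the construction $\mathfrak{b} \mapsto \mathfrak{a}$, let us see that $\mathfrak{a}$ is in $\widetilde{\mathcal{I}}$. If $X_0F$ is in
$\mathfrak{a}_{d+1}$, then we can write $X_0F = \varphi_{d+1}(g)$ for some $g$ in $\mathfrak{b} \cap A_{\le d+1}$. That
is, $X_0F(X_0,\dots,X_n) = X_0^{d+1} g(X_1/X_0,\dots,X_n/X_0)$, and $F(X_0,\dots,X_n) =
X_0^{d} g(X_1/X_0,\dots,X_n/X_0)$. This formula shows that $g$ is in $A_{\le d}$ and that $F =
\varphi_d(g)$. Hence $F$ is in $\mathfrak{a}_d$. In other words, the construction $\mathfrak{b} \mapsto \mathfrak{a}$ carries members
of $\mathcal{I}$ to members of $\widetilde{\mathcal{I}}$.

Under the construction $\mathfrak{b} \mapsto \mathfrak{a}$, the homogeneous ideal $\mathfrak{a}$ has the property that

$$\beta_0^t(\mathfrak{a}) = \beta_0^t\Big(\bigoplus_{d=0}^{\infty} \mathfrak{a}_d\Big) = \sum_{d=0}^{\infty} \beta_0^t(\mathfrak{a}_d) = \sum_{d=0}^{\infty} (\mathfrak{b} \cap A_{\le d}) = \mathfrak{b}.$$

Thus our construction starting from an ideal of $A$, passing to an ideal in the set
$\widetilde{\mathcal{I}}$, and passing back to an ideal of $A$ recovers the original ideal of $A$.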

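Now suppose that $\mathfrak{a}$ is in $\widetilde{\mathcal{I}}$. Put $\mathfrak{b} = \beta_0^t(\mathfrak{a})$. To see that the above passage to a
member of $\widetilde{\mathcal{I}}$ recovers $\mathfrak{a}$ from $\mathfrak{b}$, we are to show that

$$\mathfrak{a} \cap \widetilde{A}_d = \varphi_d(\mathfrak{b} \cap A_{\le d}). \tag{$*$}$$

First we establish that

$$\beta_0^t(\mathfrak{a} \cap \widetilde{A}_d) = \beta_0^t(\mathfrak{a}) \cap A_{\le d}. \tag{$**$}$$

The inclusion $\subseteq$ in $(**)$ is easy because $\beta_0^t(\mathfrak{a} \cap \widetilde{A}_d) \subseteq \beta_0^t(\mathfrak{a})$ and $\beta_0^t(\widetilde{A}_d) \subseteq A_{\le d}$.
For the reverse inclusion, let $f$ be in $\beta_0^t(\mathfrak{a} \cap \widetilde{A}_k) \cap A_{\le d}$ for some $k$. This means
that $\deg f \le d$ and that $f = \beta_0^t(G)$ with $G \in \mathfrak{a} \cap \widetilde{A}_k$. Without loss of generality,
we may assume that $k \ge d$. Let $F$ be the element $F = \varphi_{\deg f}(f)$ of $\widetilde{A}_{\deg f}$.
Then $X_0^{k-\deg f} F = \varphi_k(f)$, and $\beta_0^t(X_0^{k-\deg f} F) = \beta_0^t\varphi_k(f) = f = \beta_0^t(G)$. Hence
$X_0^{k-\deg f} F$ and $G$ are members of $\widetilde{A}_k$ with the same value under $\beta_0^t$. Since $\beta_0^t$ is
one-one on $\widetilde{A}_k$, $G = X_0^{k-\deg f} F$. Since $G$ is in $\mathfrak{a}$ and since the ideal $\mathfrak{a}$ is in $\widetilde{\mathcal{I}}$, $F$ is
in $\mathfrak{a}$. Hence the element $X_0^{d-\deg f} F$ is in $\mathfrak{a} \cap \widetilde{A}_d$, and it has $\beta_0^t(X_0^{d-\deg f} F) = f$.
This proves the inclusion $\supseteq$ in $(**)$. Application of $\varphi_d$ to both sides of $(**)$
proves $(*)$ and completes the proof of the first statement of the theorem.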

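We are to show that prime ideals correspond to prime ideals. Let $\mathfrak{b}$ in $\mathcal{I}$ be
prime, and let $\mathfrak{a}$ be the ideal in $\widetilde{\mathcal{I}}$ with $\beta_0^t(\mathfrak{a}) = \mathfrak{b}$. Let $F$ and $G$ be homogeneous
elements in $\widetilde{A}$ of respective degrees $d$ and $e$ with $FG$ in $\mathfrak{a}$. Then $fg$ lies in $\mathfrak{b}$,
where $f = \beta_0^t(F)$ and $g = \beta_0^t(G)$, and one of $f$ and $g$ lies in $\mathfrak{b}$ because $\mathfrak{b}$ is
prime. Say $f$ is in $\mathfrak{b}$. Then $F = \varphi_d(f)$ lies in the right side of $(*)$ and hence lies
in the left side. Consequently $F$ is in $\mathfrak{a}$, and $\mathfrak{a}$ is prime.

Conversely let $\mathfrak{a}$ in $\widetilde{\mathcal{I}}$ be prime, and let $\mathfrak{b} = \beta_0^t(\mathfrak{a})$. Suppose that $f$ and $g$ are
members of $A$ with $fg$ in $\mathfrak{b}$. Put $d = \deg f$ and $e = \deg g$, and define $F = \varphi_d(f)$
and $G = \varphi_e(g)$. Then $FG = \varphi_{d+e}(fg)$ is in $\varphi_{d+e}(\mathfrak{b} \cap A_{\le d+e})$, and $(*)$ shows
that $FG$ is in $\mathfrak{a} \cap \widetilde{A}_{d+e}$. Since $\mathfrak{a}$ is prime, one of $F$ and $G$ is in $\mathfrak{a}$. Say that $F$ is
in $\mathfrak{a}$. Then $f = \beta_0^t(F)$ is in $\mathfrak{b}$, and $\mathfrak{b}$ is prime.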




Corollary 10.21. The inclusion $\beta_0 : \mathbb{A}^n \to \mathbb{P}^n$ sets up a one-one correspondence
between the prime ideals in $A$ and those prime homogeneous ideals in $\widetilde{A}$
that do not contain $X_0$.


PROOF. If $\mathfrak{a}$ is a prime homogeneous ideal in $\widetilde{A}$ and $X_0F$ is in $\mathfrak{a}$, then either
$X_0$ or $F$ is in $\mathfrak{a}$. If we can always exclude $X_0$ from being in $\mathfrak{a}$, then $F$ is in $\mathfrak{a}$, and
the condition in Theorem 10.20 for $\mathfrak{a}$ to be in $\widetilde{\mathcal{I}}$ is satisfied. The rest follows from
Theorem 10.20.


Corollary 10.22. Let $\mathfrak{a}$ be a prime homogeneous ideal of $\widetilde{A}$ not containing
$X_0$, and let $\mathfrak{b} = \beta_0^t(\mathfrak{a})$ be the corresponding prime ideal of $A$. Then the Zariski
closure in $\mathbb{P}^n$ of $\beta_0(V(\mathfrak{b}))$ is $V(\mathfrak{a})$.


REMARKS. In other words, if an affine variety $V$ has $\mathfrak{b}$ as its ideal in $A$, then
the projective closure of $V$ has the corresponding $\mathfrak{a}$ from Theorem 10.20 as its
ideal in $\widetilde{A}$.


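PROOF. Corollary 10.19 shows that $\overline{\beta_0(V(\mathfrak{b}))} = V(\mathfrak{a}')$ for some prime homogeneous
ideal $\mathfrak{a}'$ of $\widetilde{A}$. Since $\beta_0(V(\mathfrak{b})) \subseteq V(\mathfrak{a})$ by Lemma 10.17 and since $V(\mathfrak{a})$ is
closed in $\mathbb{P}^n$, $V(\mathfrak{a}') \subseteq V(\mathfrak{a})$. Arguing by contradiction, suppose that the inclusion
is strict. Applying $I(\,\cdot\,)$ and using Proposition 10.12b, we obtain $\mathfrak{a}' \supseteq \mathfrak{a}$. Since
application of $V(\,\cdot\,)$ to both sides of $\mathfrak{a}' \supseteq \mathfrak{a}$ has to yield a strict inclusion, we must
have $\mathfrak{a}' \supsetneq \mathfrak{a}$. Choose $G$ homogeneous in $\mathfrak{a}'$ that is not in $\mathfrak{a}$, and put $f = \beta_0^t G$. If
$(x_1,\dots,x_n)$ is in $V(\mathfrak{b})$, then $[1, x_1,\dots,x_n]$ is in $\beta_0(V(\mathfrak{b})) \subseteq V(\mathfrak{a}')$, and hence
$f(x_1,\dots,x_n) = G(1, x_1,\dots,x_n) = 0$. Thus $f$ is in $I(V(\mathfrak{b})) = \mathfrak{b}$. Since
$\deg f \le \deg G$, the construction of $\mathfrak{a}$ from $\mathfrak{b}$ in the proof of Theorem 10.20
shows that $F = \varphi_{\deg G}(f)$ is in $\mathfrak{a}$. Then $G$ and $F$ are members of $\widetilde{A}_{\deg G}$ with
$\beta_0^t(G) = f = \beta_0^t(F)$, and we obtain $G = F$, contradiction.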


EXAMPLE. Twisted cubic from the example in Section 1 and Example 2 in
Section 2. The prime ideal $\mathfrak{b} \subseteq k[X, Y, Z]$ is $(Y - X^2,\ Z - X^3)$, and we want
to find the corresponding ideal $\mathfrak{a}$ given by Corollary 10.21. Let the additional
indeterminate in $\widetilde{A}$ be $W$. Applying $\varphi_2$ and $\varphi_3$ to the respective generators $Y - X^2$
and $Z - X^3$ yields $WY - X^2$ and $W^2Z - X^3$. These must be in $\mathfrak{a}$. So must

$$(W^2Z - X^3) - X(WY - X^2) = W(WZ - XY)$$

and

$$X(W^2Z - X^3) - (WY + X^2)(WY - X^2) = W^2(XZ - Y^2).$$

Since we seek a prime ideal for $\mathfrak{a}$ and $W$ is not to be in $\mathfrak{a}$, $WZ - XY$ and $XZ - Y^2$
are in $\mathfrak{a}$. Thus $\mathfrak{a} \supseteq (WY - X^2,\ WZ - XY,\ XZ - Y^2)$. If $\mathfrak{c}$ denotes the ideal on
the right, then $\mathfrak{a} \supseteq \mathfrak{c}$ and

$$\beta_0^t(\mathfrak{c}) = (Y - X^2,\ Z - XY,\ XZ - Y^2) = (Y - X^2,\ Z - X^3,\ XZ - X^4) = (Y - X^2,\ Z - X^3) = \beta_0^t(\mathfrak{a}).$$

To show that $\mathfrak{a} = \mathfrak{c}$, it is enough according to Theorem 10.20 to show that if $F$
is homogeneous and $WF$ is in $\mathfrak{c}$, then $F$ is in $\mathfrak{c}$. The three generators of $\mathfrak{c}$ are all
in $\widetilde{A}_2$, and thus $\mathfrak{c} \cap \widetilde{A}_d = \widetilde{A}_{d-2}(\mathfrak{c} \cap \widetilde{A}_2)$. Hence it is enough to show that $\mathfrak{c} \cap \widetilde{A}_2$
contains no nonzero element divisible by $W$. Since $\mathfrak{c} \cap \widetilde{A}_2$ consists of all linear
combinations of the three generators, we can check this fact by inspection. The
result is that $\mathfrak{a} = \mathfrak{c}$. Once we know $\mathfrak{a}$, we can compute the projective closure of the
twisted cubic from Corollary 10.22. We find that it consists of all $[w, x, y, z]$ of
the form $[1, x, x^2, x^3]$ together with $[0, 0, 0, 1]$. We might have guessed this form
for the projective closure from the parametric realization of the twisted cubic in $\mathbb{A}^3$
and from a passage to the limit, but proceeding in that fashion requires operations
that we have certainly not justified.
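
The two polynomial identities used in this example, and the claim that the listed points lie on the zero locus of the three generators, can be checked mechanically. The following Python sketch is only a sanity check under stated assumptions: it works with integer coefficients, the helper names `add`, `mul`, and `evaluate` are hypothetical, and it verifies the identities and a sample of points rather than the equality of ideals.

```python
from collections import defaultdict

def add(p, q, sign=1):
    """Return p + sign*q for polynomials stored as {exponent tuple: coefficient}."""
    r = defaultdict(int, p)
    for e, c in q.items():
        r[e] += sign * c
    return {e: c for e, c in r.items() if c != 0}

def mul(p, q):
    """Return the product p*q."""
    r = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            r[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return {e: c for e, c in r.items() if c != 0}

def evaluate(p, point):
    """Evaluate p at a point given as a tuple (w, x, y, z)."""
    total = 0
    for e, c in p.items():
        term = c
        for v, k in zip(point, e):
            term *= v ** k
        total += term
    return total

# Indeterminates, with exponents ordered (W, X, Y, Z).
W, X, Y, Z = ({tuple(int(i == j) for j in range(4)): 1} for i in range(4))

g1 = add(mul(W, Y), mul(X, X), -1)                  # W*Y - X^2
g2 = add(mul(mul(W, W), Z), mul(mul(X, X), X), -1)  # W^2*Z - X^3
c1 = add(mul(W, Z), mul(X, Y), -1)                  # W*Z - X*Y
c2 = add(mul(X, Z), mul(Y, Y), -1)                  # X*Z - Y^2

# (W^2 Z - X^3) - X (W Y - X^2) = W (W Z - X Y)
identity1_holds = add(g2, mul(X, g1), -1) == mul(W, c1)
# X (W^2 Z - X^3) - (W Y + X^2)(W Y - X^2) = W^2 (X Z - Y^2)
wy_plus_x2 = add(mul(W, Y), mul(X, X))
identity2_holds = add(mul(X, g2), mul(wy_plus_x2, g1), -1) == mul(mul(W, W), c2)

# The generators of c vanish at [1, t, t^2, t^3] and at [0, 0, 0, 1].
generators = [g1, c1, c2]
affine_points_ok = all(evaluate(g, (1, t, t**2, t**3)) == 0
                       for g in generators for t in range(-3, 4))
point_at_infinity_ok = all(evaluate(g, (0, 0, 0, 1)) == 0 for g in generators)
```

All four flags come out true, which confirms the arithmetic in the example. The check says nothing about whether further points satisfy the three equations; that part still rests on the argument in the text.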


4. Rational Functions and Regular Functions 


We continue to assume that $k$ is an algebraically closed field and to write $A$ for
$k[X_1,\dots,X_n]$ and $\widetilde{A}$ for $k[X_0,\dots,X_n]$. In this section we investigate certain
classes of $k$-valued functions on quasiprojective varieties, specifically the "rational"
functions, the "regular" functions, and the local ring of functions regular
at a particular point. For each kind of variety that we have introduced (affine, 
quasi-affine, projective, and quasiprojective), there are simple global definitions 
and there are complicated but equivalent local definitions for these notions. The 
complicated definitions have three advantages over the simple ones: they are 
virtually the same for all four kinds of varieties and therefore make it possible 
to work with all kinds of varieties uniformly, they make it possible in practice to 
construct a function by constructing only a local part of it, and they prepare the 
way better for a definition of isomorphism of varieties that does not insist on a 
particular dimension for the ambient affine or projective space. 

In this section we shall first give the simple definitions in the affine and 
quasi-affine cases and then prove results saying that certain more complicated 
local-sounding versions of these definitions amount to the same thing as the 
simple definitions. Then we shall give the simple definitions in the projective and 
quasiprojective cases. Finally we shall relate the quasi-affine and quasiprojective 
cases and show that certain more complicated local-sounding definitions in the 
quasiprojective case amount to the same thing as the simple definitions. 

We begin with affine varieties. Suppose that $V = V(\mathfrak{p})$ is an affine variety in
$\mathbb{A}^n$, $\mathfrak{p}$ being a prime ideal in $A$. The affine coordinate ring of $V$ is $A(V) = A/\mathfrak{p}$,
which is an integral domain. Let us write the quotient homomorphism $A \to A(V)$
as $a \mapsto \bar{a}$. Because of the Nullstellensatz, $A(V)$ can be identified with the ring of
all restrictions of polynomials to $V$; in particular, $\bar{a}(P)$ is meaningful for every
$\bar{a} \in A(V)$ and $P \in V$.




Proposition 10.23. If $V$ is an affine variety in $\mathbb{A}^n$, then the points $P$ of $V$ are
in one-one correspondence with the maximal ideals $\mathfrak{m}_P$ of the affine coordinate
ring $A(V)$, the correspondence being that $\mathfrak{m}_P$ is the maximal ideal of all members
$\bar{a}$ of $A(V)$ with $\bar{a}(P) = 0$.


PROOF. Each $\mathfrak{m}_P$ is a maximal ideal, being the kernel of a multiplicative
linear functional. In the reverse direction, if $\mathfrak{m}$ is a maximal ideal of $A(V)$,
then its inverse image in $A$ under the homomorphism $A \to A/\mathfrak{p} = A(V)$ is a
maximal ideal $M$ of $A$ containing $\mathfrak{p}$, by the First Isomorphism Theorem. The
Nullstellensatz shows that $M$ consists of all polynomials vanishing at some point
$P$. Applying $V(\,\cdot\,)$ to the inclusion $M \supseteq \mathfrak{p}$ gives $\{P\} = V(M) \subseteq V(\mathfrak{p}) = V$.
Thus $P$ is in $V$.


Members of the field of fractions $k(V)$ of $A(V)$ are called rational functions
on $V$, and $k(V)$ is called the function field on $V$. Rational functions on $V$ are
not really functions on $V$ in the traditional sense, since their denominators can
vanish here and there. By way of compensation, an allowable denominator never
vanishes identically; the reason is that the construction of a field of fractions
of an integral domain does not involve using the zero element of the integral
domain in a denominator. If $f$ is a rational function on $V$ and $P$ is in $V$, one
says that $f$ is regular at $P$, or defined at $P$, if there exist $\bar{a}$ and $\bar{b}$ in $A(V)$
with $\bar{b}(P) \ne 0$ such that $f = \bar{a}/\bar{b}$. In this case, an equality $\bar{a}/\bar{b} = \bar{a}'/\bar{b}'$
with $\bar{b}(P) \ne 0$ and $\bar{b}'(P) \ne 0$ implies that $\bar{a}\bar{b}' = \bar{a}'\bar{b}$, from which we see that
$\bar{a}(P)\bar{b}'(P) = \bar{a}'(P)\bar{b}(P)$ and that $\bar{a}(P)/\bar{b}(P) = \bar{a}'(P)/\bar{b}'(P)$. Hence $f(P)$ can
be defined unambiguously as $f(P) = \bar{a}(P)/\bar{b}(P)$. For $P$ in $V$, the set of rational
functions on $V$ that are regular at $P$ is a $k$ algebra, as we see by carrying out the
usual manipulations to add or multiply fractions. This $k$ algebra is denoted by
$\mathcal{O}_P(V)$. It has $A(V) \subseteq \mathcal{O}_P(V) \subseteq k(V)$.

As in Proposition 10.23, let $\mathfrak{m}_P$ be the maximal ideal of all members $\bar{a}$ of $A(V)$
with $\bar{a}(P) = 0$. The localization of $A(V)$ with respect to this maximal ideal is
exactly $\mathcal{O}_P(V)$. In fact, the localization is a subring of $k(V)$ because $A(V)$ is an
integral domain. The members of $\mathcal{O}_P(V)$ are exactly the quotients $f = \bar{a}/\bar{b}$ with
$\bar{a}$ and $\bar{b}$ in $A(V)$ and with $\bar{b}$ not in $\mathfrak{m}_P$. Hence $\mathcal{O}_P(V) = S^{-1}A(V)$, where $S$ is
the set-theoretic complement of $\mathfrak{m}_P$. Thus $\mathcal{O}_P(V)$ is the asserted localization. It
has a unique maximal ideal and is called the local ring of $V$ at $P$.

A rational function is said to be regular on an open subset $U$ of $V$ if it is
regular at every point of $U$. The regular functions on $U$ form a $k$ algebra denoted
by $\mathcal{O}(U)$. In symbols the definition of $\mathcal{O}(U)$ is $\mathcal{O}(U) = \bigcap_{P \in U} \mathcal{O}_P(V)$.

When $A(V)$ is a unique factorization domain, the definition of regular at a
point is simple enough to implement globally: we write $f = \bar{a}/\bar{b}$ in some
fashion, reduce the fraction to lowest terms, and then read off all the points $P$
for which $f$ is defined from the single expression of $f$ as a quotient. Ordinarily,
however, $A(V)$ is not a unique factorization domain, and then the definition is
more subtle, as the following example shows.


EXAMPLE. $V = V(\mathfrak{p})$ with $\mathfrak{p} = (XW - YZ)$ and $n = 4$. The polynomial
$XW - YZ$ is irreducible, and thus $V$ is an affine variety in $\mathbb{A}^4$. The affine
coordinate ring is $A(V) = k[W, X, Y, Z]/(XW - YZ)$. The quotient $f = \bar{X}/\bar{Y}$
is a rational function on $V$, since $\bar{Y}$ is not the 0 element of $A(V)$, and the definition
shows that $f$ is regular at all points $(w, x, y, z)$ of $V$ having $y \ne 0$. From
$\bar{X}\bar{W} - \bar{Y}\bar{Z} = 0$, we have $\bar{X}/\bar{Y} = \bar{Z}/\bar{W}$, and thus $f$ is defined also at all points
$(w, x, y, z)$ of $V$ having $w \ne 0$. For example it is defined at the additional point
$(w, x, y, z) = (1, 0, 0, 0)$. Actually, there exist no members $\bar{a}$ and $\bar{b}$ of $A(V)$ with
$f = \bar{a}/\bar{b}$ and $\bar{b}(w, x, y, z) \ne 0$ whenever $xw = yz$ and one or both of $w$ and $y$
are nonzero. The details are carried out in Problem 8 at the end of the chapter.
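
The need for two different expressions can be seen numerically. The sketch below is an illustration only, assuming rational coordinates in place of an algebraically closed field; the function name `f_value` is hypothetical. It evaluates $f$ by whichever of the two quotients $x/y$ and $z/w$ has a nonvanishing denominator at the given point.

```python
from fractions import Fraction

def f_value(w, x, y, z):
    """Evaluate f = X/Y = Z/W at a point (w, x, y, z) of V(XW - YZ).

    Returns None when both denominators vanish, i.e. when neither of
    the two expressions considered in the example applies.
    """
    if x * w != y * z:
        raise ValueError("point is not on V(XW - YZ)")
    if y != 0:
        return Fraction(x, y)
    if w != 0:
        return Fraction(z, w)
    return None
```

At a point with $y \ne 0$ and $w \ne 0$ the two quotients agree because $xw = yz$, so the order of the tests does not affect the value. At $(1, 0, 0, 0)$ only the second quotient is usable, and it gives the value 0.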


The set of points $P$ in the affine variety $V$ at which a rational function $f$ on $V$
fails to be regular is called the pole set of $f$.


Proposition 10.24. If $f$ is a rational function on the affine variety $V = V(\mathfrak{p})$,
then the pole set of $f$ is the affine algebraic set $V(\mathfrak{a}) \subseteq V(\mathfrak{p})$ corresponding to
the ideal $\mathfrak{a} \supseteq \mathfrak{p}$ of all $b \in A$ such that $\bar{b}f$ is in $A(V)$.


PROOF. The set $\mathfrak{a}$ in the statement is an ideal in $A$ that contains $\mathfrak{p}$. Hence
$V(\mathfrak{a}) \subseteq V(\mathfrak{p})$. If $P$ is in $V(\mathfrak{p})$ and $f$ is defined at $P$, then there are members $\bar{a}$
and $\bar{b}$ of $A(V)$ with $\bar{b}(P) \ne 0$ such that $\bar{b}f = \bar{a}$; any representative of this $\bar{b}$ in $A$
lies in $\mathfrak{a}$, and consequently $P$ is not in $V(\mathfrak{a})$. Conversely if $f$ is not defined at $P$,
then no $\bar{b}$ such that $\bar{b}f$ is in $A(V)$ has $\bar{b}(P) \ne 0$. That is, no member $b$ of $\mathfrak{a}$ has
$b(P) \ne 0$. So $P$ is in $V(\mathfrak{a})$. This proves that the pole set of $f$ is exactly $V(\mathfrak{a})$.


Corollary 10.25. If $V = V(\mathfrak{p})$ is an affine variety, then

$$A(V) = \bigcap_{P \in V} \mathcal{O}_P(V).$$

REMARKS. In the notation introduced above, the corollary says that $A(V) =
\mathcal{O}(V)$.


PROOF. The inclusion $\subseteq$ follows from the fact that $A(V) \subseteq \mathcal{O}_P(V)$ for each
$P$. For the reverse inclusion, suppose that $f$ lies in $\bigcap_{P \in V} \mathcal{O}_P(V)$. Then the
pole set of $f$ in $V$ is empty. The pole set for $f$ is the set $V(\mathfrak{a})$ for the ideal $\mathfrak{a}$ in
Proposition 10.24, and it follows from the Nullstellensatz that $\mathfrak{a} = A$. Then 1 is
in $\mathfrak{a}$, and the definition of $\mathfrak{a}$ shows that $f$ is in $A(V)$.


If we consider the complement of the pole set of $f$, then we see from Proposition
10.24 that the subset of $V$ at which $f$ is regular is (relatively) open in $V$.
Hence it is empty or dense in $V$. On the set where $f$ is regular, $f$ is continuous
into $\mathbb{A}^1$, according to the following proposition.




Proposition 10.26. If a rational function $f$ on the affine variety $V$ is regular
on the nonempty open set $U$ of $V$, then it is continuous from $U$ into $\mathbb{A}^1$ with the
Zariski topology (in which the proper closed sets are the finite sets).


PROOF. It is to be proved that $f^{-1}$ of any finite subset of $\mathbb{A}^1$ is relatively closed
in $U$. Since the finite union of closed sets is closed, it is enough to consider
$f^{-1}(\{c\})$ for an element $c$ of $k$. This is the intersection with $U$ of the pole set of
$1/(f - c)$, which is relatively closed in $U$ by Proposition 10.24.


Now we can give the simple definitions in the quasi-affine case. Let the
quasi-affine variety $U$ in $\mathbb{A}^n$ have closure the affine variety $V$. If $f$ is a rational function
on $V$, then Proposition 10.24 shows that $f$ is regular on a nonempty open subset
of $V$. Since the intersection of any two nonempty open subsets is nonempty, $f$
is regular on a nonempty open subset of $U$. Therefore it is meaningful to view $f$
as a rational function on $U$. We define the function field of rational functions
on $U$ to be the same as the function field of $V$: $k(U) = k(V)$. The definition of
regular function at $P$ is the same for the quasi-affine variety $U$ as for its Zariski
closure $V$, and thus the local ring of $U$ at $P$ is given by $\mathcal{O}_P(U) = \mathcal{O}_P(V)$. A
rational function is said to be regular on the quasi-affine variety $U$ if it is regular
at every point of $U$. Since $k(U) = k(V)$, the set of regular functions on $U$ is the
$k$ algebra $\mathcal{O}(U) = \bigcap_{P \in U} \mathcal{O}_P(U)$.

The next step is to prove results saying that certain more complicated local- 
sounding definitions of the above notions amount to the same thing. 


Lemma 10.27. If $V$ is an affine variety, then any two members of the affine
coordinate ring $A(V)$ that are equal on a nonempty open subset of $V$ are the same.


PROOF. Subtracting, we may suppose that $\bar{a} \in A(V)$ is 0 on the nonempty
open subset $U$ of $V$. By Proposition 10.26, $\bar{a}$ is continuous from $V$ into $\mathbb{A}^1$. The
complement of $\bar{a}^{-1}(\{0\})$ has to be open in $V$ and disjoint from $U$, and therefore
it is empty. So $\bar{a}$ is everywhere 0 and is the 0 element of $A(V)$.


Proposition 10.28. Let $U$ be a nonempty open subset of the affine variety $V$
in $\mathbb{A}^n$. Suppose that $f_0 : U \to k$ is a function with the following property: for
each $P$ in $U$, there exist an open subset $W$ of $U$ containing $P$ and polynomials
$a$ and $b$ in $A$ such that $b$ is nowhere vanishing on $W$ and $f_0 = a/b$ on $W$. Then
there exists one and only one member $f$ of $k(V)$ such that $f$ is regular on $U$ and
agrees with $f_0$ at every point of $U$.


REMARKS. For the quasi-affine case the more complicated local-sounding
definition of "regular function" on $U$, mentioned in the first paragraph of this
section, is what is assumed of $f_0$ in the statement of this proposition. The
proposition says that such an $f_0$ necessarily comes from a global rational function
on $V$ that is regular on $U$ in the sense just above.




PROOF OF UNIQUENESS. If there are two such members of $k(V)$, then subtracting
them gives a member $g$ of $k(V)$ that is 0 on $U$. By definition of $k(V)$,
$g = \bar{a}/\bar{b}$ with $\bar{a}$ and $\bar{b}$ in $A(V)$ and with $\bar{b} \ne 0$. Then $\bar{a} = g\bar{b}$ is a member of
$A(V)$ that is 0 on $U$. By Lemma 10.27, $\bar{a} = 0$ in $A(V)$. Thus $g\bar{b} = 0$ in $k(V)$.
Since $k(V)$ is a field and $\bar{b} \ne 0$, $g = 0$.


PROOF OF EXISTENCE. If $P$ is in $U$, then the hypothesis supplies some open
subset $W$ of $U$ containing $P$ and members $a$ and $b$ of $A$ with $b$ nowhere 0 on $W$
and with $f_0 = a/b$ on $W$. Let $\bar{a}$ and $\bar{b}$ be the images of $a$ and $b$ in $A(V)$. Since $b$
is not identically 0 on $U$, $\bar{b}$ is not the 0 element of $A(V)$. Therefore $f = \bar{a}/\bar{b}$ is
a well-defined member of $k(V)$, and it is regular on $W$ and agrees with $f_0$ there.
If we start with another point $P'$ and an open subset $W'$ of $U$ containing $P'$, then
we similarly obtain $f' = \bar{a}'/\bar{b}'$ in $k(V)$ that is regular on $W'$ and agrees with
$f_0$ there. The open subset $W \cap W'$ is nonempty, and $a/b = a'/b'$ on $W \cap W'$.
Therefore $b'a = ba'$ on $W \cap W'$. By Lemma 10.27, $\bar{b}'\bar{a} = \bar{b}\bar{a}'$ as members of
$A(V)$. Dividing, we obtain $f = f'$. Since the member $f$ of $k(V)$ is regular on
an open neighborhood of each point of $U$, it is regular on $U$.


Proposition 10.28 allows us also to give a local-sounding definition of rational
function and see that it reduces to the original definition. Specifically we consider
pairs $(U_0, f_0)$ with $U_0$ nonempty open in the quasi-affine variety $U$ and with $f_0$
satisfying the regularity condition on $U_0$ in the proposition.¹¹ Say that the pair
$(U_0, f_0)$ is equivalent to the pair $(U_1, f_1)$ if $f_0 = f_1$ on $U_0 \cap U_1$. This relation is
reflexive and symmetric. Let us see from the proposition why it is transitive. If
$(U_0, f_0)$ is equivalent to $(U_1, f_1)$, then the existence part of the proposition yields
three members of $k(V)$: one for $(U_0, f_0)$, one for $(U_0 \cap U_1, f_0) = (U_0 \cap U_1, f_1)$,
and one for $(U_1, f_1)$. The uniqueness part shows that the first two members of
$k(V)$ are equal and the last two are equal. Hence they are all equal. Now if
$(U_0, f_0)$ is equivalent to $(U_1, f_1)$ and $(U_1, f_1)$ is equivalent to $(U_2, f_2)$, then we
routinely find that $(U_0 \cap U_1, f_0)$ is equivalent to $(U_1 \cap U_2, f_2)$. From what we
have just seen, $(U_0, f_0)$ is equivalent to $(U_2, f_2)$, and the relation is therefore
transitive. We could take the union of all the sets $U_0$ appearing in the pairs
within an equivalence class and obtain the largest domain within $U$ on which
the rational function in question is regular. This notion for a rational function will
not be too useful for us, but an analogous notion for rational maps in Section 6
will be quite handy.


¹¹ That is, for each $P$ in $U_0$, there exist an open subset $W$ of $U_0$ containing $P$ and polynomials $a$
and $b$ in $A$ such that $b$ is nowhere vanishing on $W$ and $f_0 = a/b$ on $W$.


In similar fashion the local ring $\mathcal{O}_P(U)$ can be formulated in terms of "germs"
of regular functions as follows. Fix $P$ in $U$, and consider all pairs $(U_0, f_0)$ such
that $U_0$ is an open subset of $U$ containing $P$ and $f_0$ is a scalar-valued function on
$U_0$ satisfying the regularity condition on $U_0$ in the proposition.¹² Say that $(U_0, f_0)$
is equivalent to $(U_1, f_1)$ if $f_0 = f_1$ on some open subset of $U_0 \cap U_1$ containing
$P$. It is easy to see that the result is an equivalence relation. An equivalence
class is called a germ of regular functions at $P$. Germs inherit a natural addition,
scalar multiplication, and multiplication, and the set of germs at $P$ is therefore a
$k$ algebra. The use of germs is the traditional device in mathematics for isolating
local behavior of functions in arbitrarily small neighborhoods of points.


Corollary 10.29. Let $U$ be a nonempty open subset of the affine variety
$V$ in $\mathbb{A}^n$, and let $P$ be in $U$. To each germ $\{(U_0, f_0)\}$ of regular functions
at $P$ corresponds one and only one member $f$ of $k(V)$ that is associated via
Proposition 10.28 to each pair $(U_0, f_0)$. Moreover, this correspondence is a $k$
algebra isomorphism of the ring of germs onto the local ring $\mathcal{O}_P(U)$.


PROOF. If $(U_0, f_0)$ and $(U_0', f_0')$ are two pairs in a germ at $P$, then the definition
of germ gives a pair $(W, g_0)$ such that $W$ is a neighborhood of $P$ contained in
$U_0 \cap U_0'$ and $g_0$ agrees with $f_0$ and $f_0'$ on $W$. Proposition 10.28 supplies unique
members $f$, $f'$, and $g$ of $k(V)$ such that $f$ is regular on $U_0$ and agrees with $f_0$
there, such that $f'$ is regular on $U_0'$ and agrees with $f_0'$ there, and such that $g$ is
regular on $W$ and agrees with $g_0$ there. The uniqueness in the proposition shows
that $f = g$ and that $g = f'$. Therefore $f = f'$. So we have a well-defined map
of germs into $k(V)$.

The image $f$ of the pair $(U_0, f_0)$ is a member of $k(V)$ that is regular on $U_0$,
hence is defined at $P$. Thus the map on germs is into $\mathcal{O}_P(U)$. It is a $k$ algebra
homomorphism because of the definitions of the operations on germs. If the germ
of $(U_0, f_0)$ maps to 0, then $f_0$ is the 0 function on $U_0$, and any representative
$(W, g_0)$ of the germ with $W \subseteq U_0$ has $g_0$ equal to the 0 function on $W$. Thus the
germ is the 0 germ, and the $k$ algebra homomorphism is one-one. Finally if $f$ is a
member of $\mathcal{O}_P(U)$, then $f = \bar{a}/\bar{b}$ with $\bar{a}$ and $\bar{b}$ in $A(V)$ and with $\bar{b}$ nonvanishing
at $P$. By Proposition 10.26, $\bar{b}$ is nonvanishing on some open neighborhood $U_0$
of $P$. Then the germ of $(U_0, f_0)$ maps to $f$ if $f_0$ is defined as the restriction of
$\bar{a}/\bar{b}$ to $U_0$. Therefore the $k$ algebra homomorphism is onto $\mathcal{O}_P(U)$.


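This completes the discussion of the definitions in the cases of affine and
quasi-affine varieties. Next we consider projective varieties, beginning with the simple
definitions. Let $V = V(\mathfrak{p})$ be a projective variety, $\mathfrak{p}$ being a prime homogeneous
ideal in $\widetilde{A}$ different from $\bigoplus_{d \ge 1} \widetilde{A}_d$. The integral domain $\widetilde{A}(V) = \widetilde{A}/\mathfrak{p}$ is called
the homogeneous coordinate ring of $V$. Since $\mathfrak{p}$ is homogeneous, we can write
$\widetilde{A}(V)$ as

$$\widetilde{A}(V) = \bigoplus_{d=0}^{\infty} \widetilde{A}_d/(\widetilde{A}_d \cap \mathfrak{p}) = \bigoplus_{d=0}^{\infty} \widetilde{A}(V)_d.$$


¹² See the previous footnote.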




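Let us write the quotient homomorphism $\widetilde{A} \to \widetilde{A}(V)$ as $F \mapsto \bar{F}$. We say that $\bar{F}$
is homogeneous of degree $d$ if it lies in $\widetilde{A}(V)_d = \widetilde{A}_d/(\widetilde{A}_d \cap \mathfrak{p})$.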

Despite Proposition 10.12, homogeneous members of $\widetilde{A}(V)$ do not yield
well-defined functions on $V$, and we cannot simply imitate the affine case in defining
the function field of $V$. The function field $k(V)$ of $V$ is a certain proper subfield
of the field of fractions of $\widetilde{A}(V)$, namely the set of all quotients $\bar{F}/\bar{G}$ with $\bar{F}$ and $\bar{G}$
homogeneous of the same degree and with $\bar{G} \ne 0$. If the common degree of $\bar{F}$ and
$\bar{G}$ is $d$, then the quotient $\bar{F}/\bar{G}$ is homogeneous of degree 0 in $(x_0,\dots,x_n)$ and is
therefore well-defined on the equivalence class $[x_0,\dots,x_n]$ in $\mathbb{P}^n$. Such quotients
form a field because if $\bar{F}_1$ and $\bar{G}_1$ are homogeneous of degree $d$ and $\bar{F}_2$ and $\bar{G}_2$ are
homogeneous of degree $e$, then $\bar{F}_1/\bar{G}_1 + \bar{F}_2/\bar{G}_2 = (\bar{F}_1\bar{G}_2 + \bar{G}_1\bar{F}_2)/(\bar{G}_1\bar{G}_2)$
and $(\bar{F}_1\bar{F}_2)/(\bar{G}_1\bar{G}_2)$ are each the quotient of two members of $\widetilde{A}(V)$ that are
homogeneous of degree $d + e$, the denominator not being the zero element, and
because the inverse of a nonzero quotient $\bar{F}/\bar{G}$ is $\bar{G}/\bar{F}$. Elements of $k(V)$ are called rational
functions on $V$.

Although the values of homogeneous members of $\widetilde{A}$ are not meaningful on
$\mathbb{P}^n$, the zero locus of such a polynomial is well defined. If $\bar{F}$ is a member of
the quotient $\widetilde{A}(V)$ homogeneous of degree $d$, then its set of preimages in $\widetilde{A}_d$
is $F + (\widetilde{A}_d \cap \mathfrak{p})$. The members of $\widetilde{A}_d \cap \mathfrak{p}$ all vanish at every point of $V$, and
therefore whether $F$ vanishes at a point $P$ of $V$ depends only on the coset $\bar{F}$ of $F$ in
$\widetilde{A}(V)$. Accordingly, a member $h$ of $k(V)$ is said to be regular at the point $P =
[x_0,\dots,x_n]$ of $V$, or defined at $P$, if $h$ can be written as a quotient $h = \bar{F}/\bar{G}$ of
homogeneous members of $\widetilde{A}(V)$ of the same degree in such a way that $\bar{G}(P) \ne 0$.
In this case, $h(P)$ is well defined as the quotient $F(x_0,\dots,x_n)/G(x_0,\dots,x_n)$
for any $(x_0,\dots,x_n)$ representing the point $P = [x_0,\dots,x_n]$.
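
The independence of the choice of representative can be illustrated on a small case. In the Python sketch below, everything is a hypothetical example rather than notation from the text: the ambient space is the projective plane with rational coordinates, and $F = X_1X_2$ and $G = X_0^2$ are two arbitrarily chosen homogeneous polynomials of degree 2.

```python
from fractions import Fraction

def F(point):
    """Hypothetical numerator X_1 * X_2, homogeneous of degree 2."""
    x0, x1, x2 = point
    return x1 * x2

def G(point):
    """Hypothetical denominator X_0 ** 2, homogeneous of degree 2."""
    x0, x1, x2 = point
    return x0 ** 2

def h(point):
    """Value of h = F/G at a representative of P; requires G(P) != 0."""
    return Fraction(F(point), G(point))

p = (2, 3, 5)
scaled = tuple(7 * v for v in p)  # another representative of the same point
```

Scaling the representative by 7 multiplies both `F` and `G` by $7^2$, so `h(p)` and `h(scaled)` coincide, whereas `F(p)` and `F(scaled)` differ. This is the sense in which the quotient, but not the numerator alone, gives a function on projective space.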

The set of points P in the projective variety V at which a rational function h 
on V fails to be regular is called the pole set of h. The proof of the following 
result is similar to the proof of Proposition 10.24 and is therefore omitted. 


Proposition 10.30. If $h$ is a rational function on the projective variety $V =
V(\mathfrak{p})$, then the pole set of $h$ is the projective algebraic set $V(\mathfrak{a}) \subseteq V(\mathfrak{p})$
corresponding to the homogeneous ideal $\mathfrak{a} \supseteq \mathfrak{p}$ generated by all homogeneous $G \in \widetilde{A}$
such that $\bar{G}h$ is in $\widetilde{A}(V)$.


As in the case of affine varieties, the set of members of $k(V)$ regular at $P$ in $V$
is a $k$ subalgebra of $k(V)$ called the local ring of $V$ at $P$ and denoted by $\mathcal{O}_P(V)$.


Corollary 10.31. If $V = V(\mathfrak{p})$ is a projective variety, then

$$k = \bigcap_{P \in V} \mathcal{O}_P(V).$$




REMARKS. The classical prototype of this corollary is that a rational function 
without poles on the Riemann sphere is constant. A direct proof of this fact for the 
Riemann sphere in the style of this book follows by applying Proposition 6.9 to 
the sum of the given rational function and any constant function. A generalization 
appears as Corollary 9.4. 


PROOF. The inclusion $\subseteq$ is automatic. For the reverse inclusion, suppose that
the rational function $h$ on $V$ lies in $\bigcap_{P \in V} \mathcal{O}_P(V)$. Then the pole set of $h$ in $V$
is empty. The pole set for $h$ is the set $V(\mathfrak{a})$ for the ideal $\mathfrak{a}$ in Proposition 10.30,
and it follows from the homogeneous Nullstellensatz (Proposition 10.12a) that
$\widetilde{A}_N \subseteq \mathfrak{a}$ for all $N$ sufficiently large. For any such $N$, $\widetilde{A}(V)_N h$ lies in $\widetilde{A}(V)$. It is
homogeneous of degree $N$ and hence is in $\widetilde{A}(V)_N$. Iterating this inclusion gives

$$\widetilde{A}(V)_N h^k \subseteq \widetilde{A}(V)_N \quad \text{for all } k \ge 0. \tag{$*$}$$

Since $V$ is nonempty, some $X_j$ is not in $\mathfrak{p}$; to fix the notation, let us suppose
that $X_0$ is not in $\mathfrak{p}$. Then $\bar{X}_0 \ne 0$. Inclusion $(*)$ shows that $\bar{X}_0^N h^k$ lies in $\widetilde{A}(V)$
for all $k \ge 0$. Thus $h^k$ lies in the subset $\bar{X}_0^{-N}\widetilde{A}(V)$ of the field of fractions of
$\widetilde{A}(V)$, and the ring $\widetilde{A}(V)[h]$, given by the substitution homomorphism $X \mapsto h$
applied to the polynomial ring $\widetilde{A}(V)[X]$, is exhibited as an $\widetilde{A}(V)$ submodule of
the finitely generated $\widetilde{A}(V)$ module $\bar{X}_0^{-N}\widetilde{A}(V)$ of the field of fractions of $\widetilde{A}(V)$.

Since $\widetilde{A}(V)$ is Noetherian as a homomorphic image of $\widetilde{A}$, $\widetilde{A}(V)[h]$ is a finitely
generated $\widetilde{A}(V)$ module. By Proposition 8.35 of Basic Algebra, $h$ is a root of
some monic polynomial in $\widetilde{A}(V)[X]$. Say that $h$ satisfies

$$h^l + c_{l-1}h^{l-1} + \cdots + c_1 h + c_0 = 0$$

with each $c_j$ in $\widetilde{A}(V)$. Decomposing each term into homogeneous parts and
equating to 0 the sum of the terms homogeneous of degree 0 shows that we can
assume each $c_j$ to be in $\widetilde{A}(V)_0 = k$. That is, we may assume that $h$ is algebraic
over $k$. Since $k$ is algebraically closed, $h$ is in $k$.


If we consider the complement of the pole set of $h$, then we see from Proposition
10.30 that the subset of $V$ at which $h$ is regular is open in $V$. Hence it is empty
or dense in $V$. On the set where $h$ is regular, $h$ is continuous into $\mathbb{A}^1$, according
to the following proposition, whose proof is the same as for Proposition 10.26.


Proposition 10.32. If a rational function $h$ on the projective variety $V$ is
regular on the nonempty open set $U$ of $V$, then it is continuous from $U$ into $\mathbb{A}^1$
with the Zariski topology (in which the proper closed sets are the finite sets).




The procedure for extending the above remarks from projective varieties to
quasiprojective varieties is the same as for extending the earlier remarks from
affine varieties to quasi-affine varieties. Let the quasiprojective variety $U$ in $\mathbb{P}^n$
have closure the projective variety $V$. If $h$ is a rational function on $V$, then
Proposition 10.32 shows that $h$ is regular on a nonempty open subset of $V$. Since
the intersection of any two nonempty open subsets is nonempty, $h$ is regular on
a nonempty open subset of $U$. Therefore it is meaningful to view $h$ as a rational
function on $U$. Thus we define the function field of $U$ to be the same as the
function field of $V$: $k(U) = k(V)$. The definition of regular function at $P$ is
the same for the quasiprojective variety $U$ as for its Zariski closure $V$, and thus
the local ring of $U$ at $P$ is given by $\mathcal{O}_P(U) = \mathcal{O}_P(V)$. A rational function is
said to be regular on the quasiprojective variety $U$ if it is regular at every point
of $U$. The set of regular functions on $U$ is a $k$ algebra denoted by $\mathcal{O}(U)$. Thus

$$\mathcal{O}(U) = \bigcap_{P \in U} \mathcal{O}_P(U).$$

For the special case that $U = V$, Corollary 10.31 shows that $\mathcal{O}(V)$ reduces to the
constants.


The next step is to check that the simple definitions in this section in the affine
and quasi-affine cases are consistent with the simple definitions in the projective
and quasiprojective cases. Proposition 10.18 and Corollary 10.19 tell us the
extent of the overlap: any of the mappings $\beta_j : \mathbb{A}^n \to \mathbb{P}^n$ with $0 \le j \le n$
allows us to identify any quasi-affine variety with a quasiprojective variety. Thus
what we need to show is that the definitions of function field, functions regular at
a point, and functions regular on a variety amount to the same thing for a
quasi-affine variety $U$ and for the quasiprojective variety $\beta_j(U)$. For concreteness we
shall take $j = 0$.

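Corollaries 10.21 and 10.22 tell us exactly what we are to compare. The prime
ideals $\mathfrak{a}$ of $\widetilde{A}$ not containing $X_0$ are in one-one correspondence with the prime
ideals $\mathfrak{b}$ of $A$, the correspondence being $\mathfrak{b} = \beta_0^t(\mathfrak{a})$, and the Zariski closure of
$\beta_0(V(\mathfrak{b}))$ in $\mathbb{P}^n$ is $V(\mathfrak{a})$. The correspondence does not yield a natural map of $\mathfrak{b}$
into $\mathfrak{a}$. Instead, the system of linear mappings $\varphi_d : A_{\le d} \to \widetilde{A}_d$ given by

$$F(X_0, \dots, X_n) = \varphi_d(f)(X_0, \dots, X_n) = X_0^d\, f(X_1/X_0, \dots, X_n/X_0)$$

is a system of inverses to the system of restrictions $\beta_0^t|_{\widetilde{A}_d} : \widetilde{A}_d \to A_{\le d}$ of the
homomorphism $\beta_0^t : \widetilde{A} \to A$ given by

$$f(X_1, \dots, X_n) = \beta_0^t(F)(X_1, \dots, X_n) = F(1, X_1, \dots, X_n),$$

and these systems have the properties that

$$\mathfrak{a} \cap \widetilde{A}_d = \varphi_d(\mathfrak{b} \cap A_{\le d}) \qquad\text{and}\qquad \beta_0^t(\mathfrak{a} \cap \widetilde{A}_d) = \mathfrak{b} \cap A_{\le d}.$$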
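The two systems of maps can be tried out on explicit polynomials. The following is a minimal sketch, not part of the text's development: it assumes polynomials are stored as dictionaries from exponent tuples to coefficients, and the names `dehomogenize` and `homogenize` are ours, standing for $\beta_0^t$ and $\varphi_d$ respectively.

```python
def dehomogenize(F):
    """beta_0^t: substitute X_0 = 1.

    F maps exponent tuples (e_0, e_1, ..., e_n) to coefficients.
    """
    f = {}
    for exps, c in F.items():
        key = exps[1:]                      # drop the exponent of X_0
        f[key] = f.get(key, 0) + c
    return {e: c for e, c in f.items() if c != 0}


def homogenize(f, d):
    """phi_d: send f of degree <= d to X_0^d f(X_1/X_0, ..., X_n/X_0)."""
    F = {}
    for exps, c in f.items():
        deg = sum(exps)
        if deg > d:
            raise ValueError("polynomial has degree greater than d")
        F[(d - deg,) + exps] = c            # pad with the right power of X_0
    return F


# f = X_1^2 + 3 X_2 - 5 in k[X_1, X_2], regarded as a member of A_{<=3}
f = {(2, 0): 1, (0, 1): 3, (0, 0): -5}
F = homogenize(f, 3)

# phi_3 lands in the homogeneous polynomials of degree 3 ...
assert all(sum(exps) == 3 for exps in F)
# ... and beta_0^t restricted to degree 3 undoes it.
assert dehomogenize(F) == f

# On all of k[X_0, X_1, X_2] the map beta_0^t is not one-one:
# X_0 X_1 and X_1 have the same image, which is why the inverses
# have to be organized degree by degree.
assert dehomogenize({(1, 1, 0): 1}) == dehomogenize({(0, 1, 0): 1})
```

The last assertion illustrates why no single map from $A$ back to $\widetilde{A}$ is available and why the degree $d$ has to be specified.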


Proposition 10.33. Let a prime ideal $\mathfrak{a}$ of $\widetilde{A}$ not containing $X_0$ correspond to
the prime ideal $\mathfrak{b}$ of $A$ under the formula $\mathfrak{b} = \beta_0^t(\mathfrak{a})$ as in Theorem 10.20, and let
$U = V(\mathfrak{b})$ and $V = V(\mathfrak{a})$ be the respective affine and projective varieties for $\mathfrak{b}$
and $\mathfrak{a}$, V being the Zariski closure of $\beta_0(U)$ in $\mathbb{P}^n$. Then $\beta_0^t$ descends to a ring
homomorphism $\psi$ of $\widetilde{A}(V)$ onto $A(U)$, and $\psi$ in turn induces a canonical field
isomorphism $\Psi : k(V) \to k(U)$. Under the field isomorphism $\Psi$, the image of
the local ring $\mathcal{O}_{\beta_0(P)}(V)$ is $\mathcal{O}_P(U)$ for each P in U.


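PROOF. Since $\beta_0^t$ carries $\widetilde{A}$ onto $A$ and carries $\mathfrak{a}$ into $\mathfrak{b}$, $\beta_0^t$ descends to a
homomorphism $\psi$ of $\widetilde{A}/\mathfrak{a} = \widetilde{A}(V)$ onto $A/\mathfrak{b} = A(U)$. If $\bar{F}$ and $\bar{G}$ are in
the same homogeneous summand $\widetilde{A}(V)_d$ of $\widetilde{A}(V)$, then we define $\Psi(\bar{F}/\bar{G}) =
\psi(\bar{F})/\psi(\bar{G})$ as a member of the field of fractions k(U) of A(U). If $\bar{F}/\bar{G} =
\bar{F}'/\bar{G}'$, then $\bar{F}\bar{G}' = \bar{F}'\bar{G}$. Applying $\psi$, using that $\psi$ is a homomorphism, and
reinterpreting matters in k(U), we see that $\Psi(\bar{F}/\bar{G}) = \Psi(\bar{F}'/\bar{G}')$, i.e., that $\Psi$ is
well defined. A similar argument that involves clearing fractions and applying $\psi$
shows that $\Psi$ respects addition and multiplication. Therefore $\Psi$ is a field mapping
of k(V) into k(U).

Let $A(U)_{\le d}$ be the image of $A_{\le d}$ in $A/\mathfrak{b} = A(U)$. Since $\beta_0^t$ carries $\widetilde{A}_d$ onto
$A_{\le d}$ and carries $\mathfrak{a} \cap \widetilde{A}_d$ onto $\mathfrak{b} \cap A_{\le d}$, $\psi$ carries $\widetilde{A}(V)_d$ onto $A(U)_{\le d}$. Any
member of k(U) is the quotient of two members of $A(U)_{\le d}$ for some d, and
it is consequently $\Psi$ of the quotient of the corresponding members of $\widetilde{A}(V)_d$.
Therefore $\Psi$ carries k(V) onto k(U) and is a field isomorphism.

Let $\bar{F}$ and $\bar{G}$ in $\widetilde{A}(V)$ be the cosets $F + \mathfrak{a}$ and $G + \mathfrak{a}$, let $f = \beta_0^t(F)$ and $g =
\beta_0^t(G)$, and let $\bar{f}$ and $\bar{g}$ in A(U) be the cosets $f + \mathfrak{b}$ and $g + \mathfrak{b}$. Then $\psi(\bar{F}) = \bar{f}$
and $\psi(\bar{G}) = \bar{g}$, and hence $\Psi(\bar{F}/\bar{G}) = \bar{f}/\bar{g}$. Let $P = (x_1, \dots, x_n)$ be in U, so that
$\beta_0(P) = [1, x_1, \dots, x_n]$ is in $\beta_0(U)$. Define $\tilde{\beta}_0(P) = (1, x_1, \dots, x_n)$ in $\mathbb{A}^{n+1}$,
so that the class of $\tilde{\beta}_0(P)$ in $\mathbb{P}^n$ is $\beta_0(P)$. Then $\bar{g}(P) = g(P) = (\beta_0^t G)(P) =
G(\tilde{\beta}_0(P)) = \bar{G}(\tilde{\beta}_0(P))$. Therefore $\bar{f}/\bar{g}$ lies in $\mathcal{O}_P(U)$ if and only if $\bar{F}/\bar{G}$ lies
in $\mathcal{O}_{\beta_0(P)}(V)$. So $\Psi$ carries $\mathcal{O}_{\beta_0(P)}(V)$ onto $\mathcal{O}_P(U)$.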


Corollary 10.34. Let V be a projective variety, and let U be a nonempty open
subset of V. Then each member of $\mathcal{O}(U) \subseteq k(V)$ is determined as an element
in k(V) by its restriction to U.


PROOF. Subtracting two such members, we may assume that their difference
h is 0 on U. We are to prove that h = 0 in k(V). For some j with $0 \le j \le n$,
$\beta_j(\mathbb{A}^n) \cap V$ is nonempty, and we may assume that this is the case for j = 0.
The subset $V_0 = \beta_0^{-1}(V)$ of $\mathbb{A}^n$ is an affine variety. Since U and $\beta_0(\mathbb{A}^n) \cap V$ are
nonempty open subsets of V, their intersection is nonempty, and $U_0 = \beta_0^{-1}(U)$ is
a nonempty open subset of $V_0$. Let $\Psi : k(V) \to k(V_0)$ be the field isomorphism
in Proposition 10.33. By assumption, h is in $\mathcal{O}_{\beta_0(P)}(V)$ for every P in $U_0$. Since
the value of h at $\beta_0(P)$ is 0, h is actually in the maximal ideal of $\mathcal{O}_{\beta_0(P)}(V)$ for P
in $U_0$. Proposition 10.33 shows that $\Psi(h)$ is in the maximal ideal of $\mathcal{O}_P(V_0)$ for
all P in $U_0$. Fix $P_0$ in $U_0$. Then we can write the member $\Psi(h)$ of $k(V_0)$ as
$\Psi(h) = a/b$ with $b(P_0) \ne 0$. Since b is continuous on $V_0$ by Proposition 10.26,
b(P) is nonzero for all P in some neighborhood W of $P_0$ contained in $U_0$. Then
the formula $\Psi(h) = a/b$ shows explicitly that $\Psi(h)$ is defined at such points
P and satisfies $\Psi(h)(P) = a(P)/b(P)$. Since $\Psi(h)$ is in the maximal ideal of
$\mathcal{O}_P(V_0)$ for all P in $U_0$, $\Psi(h)(P) = 0$ for P in W. Hence a(P) = 0 for P in
W. Consequently a and 0 are two members of $A(V_0)$ that are equal on W, and
Lemma 10.27 allows us to conclude that a = 0. Therefore h = 0.


Proposition 10.35. Let U be a nonempty open subset of the projective variety
V in $\mathbb{P}^n$. Suppose that $h_0 : U \to k$ is a function with the following property: for
each P in U, there exist an open subset W of U containing P and homogeneous
polynomials F and G in $\widetilde{A}$ of the same degree such that G is nowhere vanishing
on W and $h_0 = F/G$ on W. Then there exists one and only one member h of
k(V) such that h is regular on U and agrees with $h_0$ at every point of U.


REMARKS. For the quasiprojective case the more complicated local-sounding
definition of “regular function” on U, mentioned in the first paragraph of this
section, is what is assumed of $h_0$ in the statement of this proposition. The
proposition says that such an $h_0$ necessarily comes from a global rational function
on V that is regular on U in the sense just above.


PROOF. For each j with $0 \le j \le n$ such that $V_j = \beta_j(\mathbb{A}^n) \cap V$ is nonempty,
$\beta_j^{-1}(V_j)$ is an affine variety, and $U_j = U \cap V_j$ is a nonempty open subset such that
$h_{j,0} = h_0|_{U_j}$ is a function on $U_j$ with the following property: for each P in $U_j$,
there exist an open subset W of $U_j$ containing P and homogeneous polynomials
F and G in $\widetilde{A}$ of the same degree such that G is nowhere vanishing on W and
$h_{j,0} = F/G$ on W. We pull back this situation by $\beta_j$, writing $\beta_j^* h_{j,0}$ for the
function on $\beta_j^{-1}(W)$ given by $(\beta_j^* h_{j,0})(Q) = h_{j,0}(\beta_j(Q))$. The set $\beta_j^{-1}(V_j)$ is an
affine variety, and the Zariski closure of $V_j$ in $\mathbb{P}^n$ is V. The homomorphism $\beta_j^t$ on
$\widetilde{A}$ descends to a ring homomorphism $\psi_j : \widetilde{A}(V) \to A(\beta_j^{-1}(V_j))$, and $\psi_j$ induces
a field isomorphism $\Psi_j : k(V) \to k(\beta_j^{-1}(V_j))$, according to Proposition 10.33.

The set $\beta_j^{-1}(U_j)$ is a nonempty open subset of the affine variety $\beta_j^{-1}(V_j)$,
and $\beta_j^* h_{j,0}$ is a function on $\beta_j^{-1}(U_j)$ with the following property: for each P in
$\beta_j^{-1}(U_j)$, there exist an open subset W of $\beta_j^{-1}(U_j)$ containing P and homogeneous
polynomials F and G in $\widetilde{A}$ of the same degree such that their images $\bar{F}$ and $\bar{G}$
in $\widetilde{A}(V)$ have $\bar{G}$ nowhere vanishing on W and have $\beta_j^* h_{j,0} = \psi_j(\bar{F})/\psi_j(\bar{G}) =
\Psi_j(\bar{F}/\bar{G})$ on W. Proposition 10.33 says that $\psi_j(\bar{F}) = a$ and $\psi_j(\bar{G}) = b$ for
members a and b of $A(\beta_j^{-1}(V_j))$. We are in the situation of Proposition 10.28 with
$f_0 = \beta_j^* h_{j,0}$, and that proposition produces a unique member $h_j$ of $k(\beta_j^{-1}(V_j))$
that is regular on $\beta_j^{-1}(U_j)$ and agrees with $\beta_j^* h_{j,0}$ at every point of $\beta_j^{-1}(U_j)$.

The member h of k(V) that we seek is $h = \Psi_j^{-1}(h_j)$. To verify this assertion,
we are to show that $\Psi_j^{-1}(h_j)$ is independent of j. Thus suppose that $V_i \cap V_j \ne \varnothing$.
Fix P in $U_i \cap U_j = U \cap V_i \cap V_j$, and choose the above open neighborhood W
of P small enough for the above construction to apply for both indices i and j.
By the uniqueness in Proposition 10.28, $h_j$ is the unique member of $k(\beta_j^{-1}(V_j))$
that is regular on $\beta_j^{-1}(W)$ and agrees with $\beta_j^* h_{j,0} = \beta_j^*(h_0|_{U_j})$ at every point of
$\beta_j^{-1}(W)$. Thus $\Psi_j^{-1}(h_j) = F/G$ on W, where F and G are as in the previous
paragraph. By the same uniqueness argument, $\Psi_i^{-1}(h_i) = F/G$ on W. The
difference $\Psi_i^{-1}(h_i) - \Psi_j^{-1}(h_j)$ is a member of k(V) that is regular on W and
vanishes there. By Corollary 10.34, the difference is 0 as an element of k(V).
Therefore $\Psi_j^{-1}(h_j)$ is independent of j, and we can take h to be this member of
k(V).


Just as in the quasi-affine case, it is possible in the quasiprojective case to give
a local-sounding definition of rational function and a formulation of $\mathcal{O}_P(U)$ in
terms of germs. We shall not use these notions, and we omit any further discussion
of them.


5. Morphisms 


The goal of this section and the next is to introduce maps that make the collection 
of all quasiprojective varieties over an algebraically closed field k into the objects 
of a category in a way that does not depend on the ambient space $\mathbb{A}^n$ or $\mathbb{P}^n$ of
the variety. These maps will all be algebraic in nature, and there will be two 
choices of which class of maps to use, one involving good denominators and 
one allowing occasional bad denominators. The first kind of map will be called 
a “morphism,” and the second kind of map will be called a “dominant rational 
map.” The relationships between these two kinds of maps and the interpretation 
of these maps in terms of function fields will be of great importance in applying 
this theory. 

A variety over the algebraically closed field k henceforth will be any affine,
quasi-affine, projective, or quasiprojective variety as in the previous sections. To
each such variety V, Section 4 associates a function field k(V), a local ring
$\mathcal{O}_P(V) \subseteq k(V)$ of regular functions at each point P, and a ring $\mathcal{O}(E) =
\bigcap_{P \in E} \mathcal{O}_P(V) \subseteq k(V)$ of regular functions on each nonempty open subset E
of V. We have observed that each rational function on a variety V is regular on
some nonempty open subset of V, namely the complement of the pole set. One
further fact that we shall use about rational functions is the following.


Proposition 10.36. If P and Q are distinct points of a variety V, then there
exists a rational function $h \in k(V)$ such that h is defined at both P and Q, has
h(P) = 0, and has $h(Q) \ne 0$.


PROOF. Without loss of generality, we may assume that V is projective. Say
that $V \subseteq \mathbb{P}^n$. Let $\mathfrak{p}$ be the prime homogeneous ideal in $\widetilde{A} = k[X_0, \dots, X_n]$ such
that $\widetilde{A}(V) = \widetilde{A}/\mathfrak{p}$, and let $F \mapsto \bar{F}$ be the quotient homomorphism $\widetilde{A} \to \widetilde{A}(V)$.
Let $P = [x_0, \dots, x_n]$ and $Q = [y_0, \dots, y_n]$. Choose a homogeneous polynomial
F in $\widetilde{A}$ such that $F(x_0, \dots, x_n) = 0$ and $F(y_0, \dots, y_n) \ne 0$, and choose a
homogeneous polynomial G with deg G = deg F such that $G(x_0, \dots, x_n) \ne 0$
and $G(y_0, \dots, y_n) \ne 0$. Then $\bar{G}$ is not 0, and $h = \bar{F}/\bar{G}$ has the required
properties.
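The choice of F and G in this proof can be made completely explicit with linear forms. The sketch below is only an illustration under stated assumptions: it works in the ambient projective space with rational coordinates (the field $\mathbb{Q}$ stands in for k), and the helper names `evaluate` and `separating_forms` are ours. The form F is built from a nonvanishing $2 \times 2$ minor of the coordinates of P and Q, and G is found by trying a few sums of two coordinate functions.

```python
from fractions import Fraction


def evaluate(form, point):
    """Value of the linear form sum(c_i X_i) at a coordinate vector."""
    return sum(Fraction(c) * x for c, x in zip(form, point))


def separating_forms(P, Q):
    """Return linear forms (F, G) with F(P) = 0, F(Q) != 0, G(P) != 0, G(Q) != 0."""
    n1 = len(P)
    # A nonzero minor x_i y_j - x_j y_i exists exactly when P != Q in projective space.
    minors = [(i, j) for i in range(n1) for j in range(n1)
              if P[i] * Q[j] - P[j] * Q[i] != 0]
    if not minors:
        raise ValueError("P and Q represent the same projective point")
    i, j = minors[0]
    F = [0] * n1
    F[j], F[i] = P[i], -P[j]          # F = x_i X_j - x_j X_i, so F(P) = 0
    a = next(m for m in range(n1) if P[m] != 0)
    b = next(m for m in range(n1) if Q[m] != 0)
    for t in range(3):                # at most two values of t can fail
        G = [0] * n1
        G[a] += 1
        G[b] += t
        if evaluate(G, P) != 0 and evaluate(G, Q) != 0:
            return F, G
    raise AssertionError("not reached in characteristic 0")


def h(F, G, R):
    """The rational function F/G evaluated at a point R where G does not vanish."""
    return evaluate(F, R) / evaluate(G, R)


P, Q = (1, 0, 0), (0, 1, 0)
F, G = separating_forms(P, Q)
assert h(F, G, P) == 0 and h(F, G, Q) != 0
```

Because F and G have the same degree, the value of F/G does not change when the homogeneous coordinates of a point are rescaled, which is what makes h a function on projective space.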


If U and V are varieties, then a continuous function $\varphi : U \to V$ relative to the
Zariski topology is called a morphism if for each nonempty open subset E of V
and each regular function f on E, the composition $f \circ \varphi$ is a regular function
on the open subset $\varphi^{-1}(E)$ of U. Thus $\varphi$ is to be continuous and is to induce by
composition a function from $\mathcal{O}(E)$ into $\mathcal{O}(\varphi^{-1}(E))$ for each open subset E of V.
An isomorphism of varieties is a morphism having an inverse function that is a
morphism.

It is immediate that the composition of two morphisms is a morphism and that 
the identity function is a morphism. Thus the varieties over k form a category if 
morphisms are used as the maps. 


EXAMPLES OF MORPHISMS. Suppose that k has characteristic different from 2.
Let U be $\mathbb{P}^1$, written as

$$\mathbb{P}^1 = \{[s, t] \mid (s, t) \ne (0, 0)\},$$

and let V be the projective variety in $\mathbb{P}^2$ defined by the irreducible homogeneous
polynomial $X^2 + Y^2 - Z^2$, i.e.,

$$V = \{[x, y, z] \mid x^2 + y^2 = z^2 \text{ and } (x, y, z) \ne (0, 0, 0)\}.$$

Let $\varphi : U \to V$ be the function given by

$$\varphi([s, t]) = [s^2 - t^2,\ 2st,\ s^2 + t^2].$$

This is well defined, and it is continuous because the Zariski closed proper subsets
of V are the finite sets, whose inverse images are finite sets. If F and G are
two homogeneous members of k[X, Y, Z] and if $\bar{F}$ and $\bar{G}$ are the images in
$\widetilde{A}(V) = k[X, Y, Z]/(X^2 + Y^2 - Z^2)$, we are to assume that $\bar{G}$ is not 0, i.e., that
G is not divisible by $X^2 + Y^2 - Z^2$, and then $h = \bar{F}/\bar{G}$ is a typical rational
function on V. We are to show that if h is regular on an open subset E of V, then
$h \circ \varphi$ is regular on $\varphi^{-1}(E) \subseteq \mathbb{P}^1$. The expression $h = \bar{F}/\bar{G}$ exhibits h as regular
on the open set E of points [x, y, z] of V with $G(x, y, z) \ne 0$. The set $\varphi^{-1}(E)$
is the set of points [s, t] in $\mathbb{P}^1$ with $G(s^2 - t^2, 2st, s^2 + t^2) \ne 0$. At such points
the function $h \circ \varphi$ is given by

$$(h \circ \varphi)(s, t) = F(s^2 - t^2, 2st, s^2 + t^2)/G(s^2 - t^2, 2st, s^2 + t^2),$$

and it is given by a rational expression with nonvanishing denominator. Thus $\varphi$
is a morphism.
Let us see that $\psi : V \to \mathbb{P}^1$ given by

$$\psi([x, y, z]) = \begin{cases} [x + z,\ y] & \text{if } [x, y, z] \ne [1, 0, -1], \\ [-y,\ x - z] & \text{if } [x, y, z] \ne [1, 0, 1] \end{cases}$$

consistently defines another morphism. For the consistency we observe that
$x^2 + y^2 = z^2$ implies that $(x + z)(x - z) = -y^2$; hence on the common domain
of the two expressions, $[x + z, y] = [-y^2/(x - z), y] = [-y/(x - z), 1] =
[-y, x - z]$. Continuity of $\psi$ follows because the inverse image of any finite set
is a finite set. For the regularity we observe that if F and G are homogeneous
members of the same degree in $\widetilde{A}(\mathbb{P}^1) = k[S, T]$ with $G \ne 0$ and if h = F/G,
then the expression h = F/G exhibits h as regular on the open set E of points
[s, t] in $\mathbb{P}^1$ with $G(s, t) \ne 0$. The set $\psi^{-1}(E)$ is the set of points [x, y, z] on V
with $G(x + z, y) \ne 0$. At such points the function $h \circ \psi$ is given by

$$(h \circ \psi)[x, y, z] = F(x + z, y)/G(x + z, y),$$

and it is given by a rational expression with a nonvanishing denominator. Thus
$\psi$ is a morphism. A direct computation shows that $\psi \circ \varphi$ and $\varphi \circ \psi$ are the
identity functions, so that $\psi$ is a two-sided inverse of $\varphi$. In other words, $\varphi$ is an
isomorphism.
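That the two maps of this example invert each other can be checked point by point over a small finite field. The following sketch makes assumptions that are not in the text: the field $\mathbb{F}_7$ is used only as a computational stand-in (it is not algebraically closed), and the function names are ours. The maps are $[s, t] \mapsto [s^2 - t^2, 2st, s^2 + t^2]$ from $\mathbb{P}^1$ to the conic $x^2 + y^2 = z^2$, and $[x, y, z] \mapsto [x + z, y]$ or $[-y, x - z]$ in the other direction.

```python
p = 7  # an odd prime; F_p is only a finite stand-in for the field k


def normalize(v):
    """Rescale homogeneous coordinates mod p so that the first nonzero entry is 1."""
    v = [c % p for c in v]
    lead = next(c for c in v if c != 0)
    inv = pow(lead, -1, p)            # modular inverse (Python 3.8+)
    return tuple(c * inv % p for c in v)


def phi(s, t):
    """The map from P^1 to the conic x^2 + y^2 = z^2."""
    return normalize((s * s - t * t, 2 * s * t, s * s + t * t))


def psi(x, y, z):
    """The map back: [x + z, y] away from [1, 0, -1], and [-y, x - z] at that point."""
    if (x + z) % p != 0 or y % p != 0:
        return normalize((x + z, y))
    return normalize((-y, x - z))


# The p + 1 points of P^1 and all points of the conic, in normalized coordinates.
line = [(1, t) for t in range(p)] + [(0, 1)]
conic = sorted({normalize((x, y, z))
                for x in range(p) for y in range(p) for z in range(p)
                if (x, y, z) != (0, 0, 0) and (x * x + y * y - z * z) % p == 0})

# The two compositions are the identity.
assert all(psi(*phi(s, t)) == (s, t) for (s, t) in line)
assert all(phi(*psi(*q)) == q for q in conic)

# The two formulas for psi agree wherever both are defined.
special = {normalize((1, 0, -1)), normalize((1, 0, 1))}
assert all(normalize((x + z, y)) == normalize((-y, x - z))
           for (x, y, z) in conic if (x, y, z) not in special)
```

A finite check of this kind does not prove the statement over k, but it exercises both branches of the second map, including the exceptional point $[1, 0, -1]$.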


Proposition 10.37. Let $\beta_0 : \mathbb{A}^n \to \mathbb{P}^n$ be the usual inclusion. If U is a
quasi-affine variety in $\mathbb{A}^n$, then $\beta_0$ is an isomorphism of the quasi-affine variety
U onto the quasiprojective variety $\beta_0(U)$.


PROOF. Proposition 10.18 shows that $\beta_0$ is a homeomorphism of U onto its
image. The last conclusion of Proposition 10.33 implies that the regular functions
for U match those for $\beta_0(U)$ under $\beta_0$, and the result follows.


Theorem 10.38. Let U be any variety, let V be any affine variety, and let
A(V) be the affine coordinate ring of V. Then the morphisms $\varphi : U \to V$ are in
one-one correspondence with the k algebra homomorphisms $\tilde{\varphi} : A(V) \to \mathcal{O}(U)$
via the formula

$$\tilde{\varphi}(f) = f \circ \varphi \qquad \text{for } f \in A(V).$$


REMARKS. Members f of A(V) lie in $\mathcal{O}(V)$. The k algebra homomorphism
$\tilde{\varphi}$ is meaningful because the fact that $\varphi$ is a morphism implies that $f \circ \varphi$ is in
$\mathcal{O}(\varphi^{-1}(E))$ for every open E in V; here we take E = V and $\varphi^{-1}(E) = U$. The
proof of Theorem 10.38 will be preceded by a lemma.


Lemma 10.39. If U is a variety and V is an affine variety in $\mathbb{A}^n$, then a function
$\psi : U \to V$ is a morphism if and only if $\bar{X}_i \circ \psi$ is a regular function on U for
the image $\bar{X}_i$ in A(V) of each coordinate function $X_i$ with $1 \le i \le n$.


PROOF. If $\psi$ is a morphism, then the definition of morphism forces $\bar{X}_i \circ \psi$ to
be a regular function.

Conversely suppose $\psi$ has the property that each $\bar{X}_i \circ \psi$ is a regular function.
Then $f \circ \psi$ is a regular function on U for each f in A(V), since every member
of A(V) is a polynomial in the elements $\bar{X}_i$. If E is a closed set in V, then E is
the locus of common zeros of some set $\{f_\alpha\}$ of polynomials, and $\psi^{-1}(E)$ is the
set of points P such that $f_\alpha(\psi(P)) = 0$ for all $\alpha$. Hence $\psi^{-1}(E)$ is the locus of
common zeros of a subset $\{f_\alpha \circ \psi\}$ of regular functions on U and is relatively
closed in U. Thus $\psi$ is continuous.

If E is nonempty open in V, then k(E) = k(V) shows that each regular
function h on E is locally the quotient of members of A(V) with nonvanishing
denominator. Let us write h = f/g with g nonvanishing near a point of interest.
Then $h \circ \psi = (f \circ \psi)/(g \circ \psi)$ is exhibited locally as a rational function with
nonvanishing denominator.


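PROOF OF THEOREM 10.38. Suppose that $\alpha : A(V) \to \mathcal{O}(U)$ is a k algebra
homomorphism. Define $\psi : U \to V$ by $\psi(P) = (\alpha(\bar{X}_1)(P), \dots, \alpha(\bar{X}_n)(P))$.
Then $\bar{X}_i \circ \psi = \alpha(\bar{X}_i)$ is in $\mathcal{O}(U)$ by definition of $\alpha$, and Lemma 10.39 shows
that $\psi$ is a morphism.

The k algebra homomorphism $\tilde{\psi}$ defined by $\tilde{\psi}(f) = f \circ \psi$ has $\tilde{\psi}(\bar{X}_i) =
\bar{X}_i \circ \psi = \alpha(\bar{X}_i)$. Since the elements $\bar{X}_i$ generate A(V), $\tilde{\psi} = \alpha$. Thus starting
from $\alpha$, forming $\psi$, and obtaining $\tilde{\psi}$ recovers $\alpha$. In the reverse direction if we
start from $\varphi$, form $\tilde{\varphi}$, and use the construction of the previous paragraph to obtain
$\psi$, then $\psi(P) = (\tilde{\varphi}(\bar{X}_1)(P), \dots, \tilde{\varphi}(\bar{X}_n)(P)) = (\bar{X}_1(\varphi(P)), \dots, \bar{X}_n(\varphi(P))) =
\varphi(P)$ for P in U. Hence $\psi = \varphi$. Thus the function $\alpha \mapsto \psi$ is a two-sided inverse
of the function $\varphi \mapsto \tilde{\varphi}$.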
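The two constructions in this proof can be imitated numerically. The sketch below is an illustration under our own assumptions rather than anything from the text: U is taken to be the affine line, V is taken to be the cuspidal cubic $y^2 = x^3$ in the plane, regular functions are modeled as Python callables on rational numbers, and the homomorphism $\alpha$ is specified by $\alpha(\bar{X}) = t^2$ and $\alpha(\bar{Y}) = t^3$. All function names are hypothetical.

```python
from fractions import Fraction

# alpha on the generators of A(V): the images of the two coordinate
# functions, given as regular functions of the parameter t on U = A^1.
alpha_on_generators = [lambda t: t ** 2, lambda t: t ** 3]


def morphism_from_hom(images):
    """psi(P) = (alpha(X_1)(P), ..., alpha(X_n)(P))."""
    return lambda P: tuple(g(P) for g in images)


def hom_from_morphism(psi):
    """The map f -> f o psi, with f a function of the coordinates on V."""
    return lambda f: (lambda P: f(*psi(P)))


psi = morphism_from_hom(alpha_on_generators)
psi_tilde = hom_from_morphism(psi)

samples = [Fraction(n, 3) for n in range(-6, 7)]

# psi takes values in V: the defining polynomial y^2 - x^3 pulls back to 0.
defining = lambda x, y: y * y - x ** 3
assert all(psi_tilde(defining)(t) == 0 for t in samples)

# Forming psi and then psi_tilde recovers alpha on the generators.
coordinates = [lambda x, y: x, lambda x, y: y]
assert all(psi_tilde(c)(t) == a(t)
           for c, a in zip(coordinates, alpha_on_generators)
           for t in samples)
```

The first assertion corresponds to the fact that $\alpha$ has to kill the ideal of V for $\psi$ to land in V, and the second is the identity $\tilde{\psi}(\bar{X}_i) = \alpha(\bar{X}_i)$ used in the proof.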


Corollary 10.40. If U and V are affine varieties, then the morphisms
$\varphi : U \to V$ are in one-one correspondence with the k algebra homomorphisms
$\tilde{\varphi} : A(V) \to A(U)$ via the formula

$$\tilde{\varphi}(f) = f \circ \varphi \qquad \text{for } f \in A(V).$$


PROOF. This is immediate from Theorem 10.38, since Corollary 10.25 shows
that $\mathcal{O}(U) = A(U)$.


Proposition 10.41. If U and V are varieties and if $\varphi : U \to V$ and $\psi : U \to V$
are morphisms such that $\varphi|_E = \psi|_E$ for some nonempty open set E in U, then
$\varphi = \psi$.


PROOF. Let h be a rational function on V, and let E' be the nonempty open
subset of V on which h is regular. Since $\varphi$ and $\psi$ are morphisms, $h \circ \varphi$ and $h \circ \psi$
are regular on the respective nonempty open subsets $\varphi^{-1}(E')$ and $\psi^{-1}(E')$ of U.
The equality $\varphi|_E = \psi|_E$ shows that $h \circ \varphi$ and $h \circ \psi$ are equal on the nonempty
open subset $E \cap \varphi^{-1}(E') \cap \psi^{-1}(E')$ of U. The function $h \circ \varphi - h \circ \psi$ is therefore
a rational extension from $E \cap \varphi^{-1}(E') \cap \psi^{-1}(E')$ to U of the 0 function, and
Corollary 10.34 shows that $h \circ \varphi - h \circ \psi = 0$ on U. Therefore $h \circ \varphi = h \circ \psi$
as elements of k(U) for every h in k(V).

Arguing by contradiction, suppose that P is a point in U for which $\varphi(P) \ne
\psi(P)$. Then Proposition 10.36 produces h in k(V) such that h is regular on
an open subset F of V containing $\varphi(P)$ and $\psi(P)$ and has $h(\varphi(P)) = 0$ and
$h(\psi(P)) \ne 0$. Since $\varphi$ and $\psi$ are morphisms, $h \circ \varphi$ and $h \circ \psi$ are regular on the
open set $\varphi^{-1}(F) \cap \psi^{-1}(F)$. Their respective values at P are $h(\varphi(P)) = 0$ and
$h(\psi(P)) \ne 0$. Since $h \circ \varphi = h \circ \psi$ as rational functions, this is a contradiction.


Proposition 10.42. Suppose that U and V are varieties and that $\varphi : U \to V$
is a morphism. If P is in U, then $\varphi$ induces a k algebra homomorphism
$\varphi_P^* : \mathcal{O}_{\varphi(P)}(V) \to \mathcal{O}_P(U)$. Composition of morphisms goes to composition
of these homomorphisms in the reverse order.


PROOF. Propositions 10.33 and 10.37 together imply that we may assume U and
V to be quasi-affine. Let f in k(V) be defined at $\varphi(P)$. Proposition 10.24 shows
that the set E on which f is regular is open in V. Since $\varphi$ is a morphism and f is
regular on E, $f \circ \varphi$ is regular on the open subset $\varphi^{-1}(E)$ of U. Proposition 10.28,
applied to $\varphi^{-1}(E) \subseteq U$, shows that there exists a unique member F of k(U) that
is regular on $\varphi^{-1}(E)$ and agrees with $f \circ \varphi$ on $\varphi^{-1}(E)$. We put $\varphi_P^*(f) = F$.
It is a routine matter to check that $\varphi_P^*$ is a k algebra homomorphism and that
compositions go to compositions in the reverse order.


6. Rational Maps 


This section will introduce a second kind of map that makes the collection of all 
(quasiprojective) varieties over the algebraically closed field k into a category. 
These maps will not be ordinary functions, and the definition requires some care. 

If U and V are varieties over the algebraically closed field k, then a rational
map $\varphi : U \to V$ is an equivalence class of pairs $(E, \varphi_E)$, where E is a nonempty
open set of U and $\varphi_E$ is a morphism of E into V. The equivalence relation on two
such pairs is that $(E, \varphi_E) \sim (E', \varphi_{E'})$ if $\varphi_E|_{E \cap E'} = \varphi_{E'}|_{E \cap E'}$. This is meaningful,
since the intersection of any two nonempty open sets is nonempty. The relation
$\sim$ is certainly reflexive and symmetric, and Proposition 10.41 shows that it is
transitive. We can therefore take the union of the open subsets E such that some
pair $(E, \varphi_E)$ is in the equivalence class, and $\varphi$ will be definable as a morphism on
this union. This union is called the largest domain on which $\varphi$ is a morphism.

A morphism from U to V defines a rational map. But a rational map need not 
be an everywhere-defined function, and forming the composition of two rational 
maps is problematic. For example, if E is the open subset of U on which a rational
map $\varphi : U \to V$ is defined and F is the open subset of V on which a rational
map $\psi : V \to W$ is defined, then it may happen that $\varphi(E)$ is disjoint from F. In
this case the composition $\psi \circ \varphi$ makes no sense.

A rational map $\varphi : U \to V$ is said to be dominant if $\varphi_E$ has dense image in
V for some (and hence every) pair $(E, \varphi_E)$ in the equivalence class. It is evident
that the composition of two dominant rational maps makes sense as a rational 
map. The identity mapping is a dominant rational map, and thus the collection 
of all varieties over k becomes a category if the dominant rational maps are used 
as the maps of the category. 

A birational map is a dominant rational map $\varphi : U \to V$ that has a dominant
rational map $\psi : V \to U$ as a two-sided inverse. Two varieties admitting a
birational map from the one to the other are said to be birationally equivalent
varieties, or to be birational.


EXAMPLE. The irreducible affine plane curves defined by $T^2 - (S^4 + 1)$ and
$Y^2 - (X^3 - 4X)$ are birationally equivalent if k has characteristic different from 2.
Birational mappings in the two directions are given by

$$S = \frac{Y}{2X}, \quad T = \frac{Y^2 + 8X}{4X^2} \qquad\text{and}\qquad X = \frac{2}{T - S^2}, \quad Y = \frac{4S}{T - S^2}.$$

The rational map from (X, Y) to (S, T) is a morphism on the complement of
(0, 0) in the locus $y^2 = x^3 - 4x$ in $\mathbb{A}^2$. The rational map from (S, T) to (X, Y)
is a morphism on the entire locus $t^2 = s^4 + 1$ in $\mathbb{A}^2$.
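The formulas in this example can be tested by brute force over a finite field. The sketch below rests on assumptions of ours: the maps are taken to be $S = Y/(2X)$, $T = (Y^2 + 8X)/(4X^2)$ in one direction and $X = 2/(T - S^2)$, $Y = 4S/(T - S^2)$ in the other, the field $\mathbb{F}_{13}$ is only a computational stand-in for k, and the function names are hypothetical.

```python
p = 13  # an odd prime; F_p is only a finite stand-in for the field k


def inv(a):
    """Inverse of a modulo p (raises ValueError if a is 0 mod p)."""
    return pow(a % p, -1, p)


# All points of the two affine curves over F_p.
quartic = [(s, t) for s in range(p) for t in range(p)
           if (t * t - (s ** 4 + 1)) % p == 0]
cubic = [(x, y) for x in range(p) for y in range(p)
         if (y * y - (x ** 3 - 4 * x)) % p == 0]


def to_cubic(s, t):
    """(S, T) -> (X, Y). On the quartic (t - s^2)(t + s^2) = 1, so t - s^2 != 0."""
    d = inv(t - s * s)
    return (2 * d % p, 4 * s * d % p)


def to_quartic(x, y):
    """(X, Y) -> (S, T), defined where x != 0."""
    return (y * inv(2 * x) % p, (y * y + 8 * x) * inv(4 * x * x) % p)


cubic_away_from_origin = [pt for pt in cubic if pt[0] != 0]

# Each map lands on the other curve, and the two maps undo each other.
assert all(to_cubic(*pt) in cubic_away_from_origin for pt in quartic)
assert all(to_quartic(*pt) in quartic for pt in cubic_away_from_origin)
assert all(to_quartic(*to_cubic(*pt)) == pt for pt in quartic)
assert all(to_cubic(*to_quartic(*pt)) == pt for pt in cubic_away_from_origin)
```

The only point of the cubic at which the first map is undefined is (0, 0), since x = 0 forces y = 0 on that curve; this matches the domain stated in the example.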


Let $\varphi : U \to V$ be a dominant rational map, and let $(E, \varphi_E)$ be any pair in the
equivalence class $\varphi$. If $f \in k(V)$ is a rational function on V, then the subset F of
V on which f is defined is open and nonempty. So $f|_F$ is a regular function on F.
Since $\varphi_E$ is continuous and has dense image, $E' = \varphi_E^{-1}(F)$ is a nonempty open
set in $E \subseteq U$. The function $\varphi_{E'}$ is a morphism from E' into F, and thus $f|_F \circ \varphi_{E'}$
is a regular function on E'. We can therefore regard it as a rational function on
U, i.e., a member of k(U). Consequently the dominant rational map $\varphi : U \to V$
induces a function $\tilde{\varphi} : k(V) \to k(U)$ that is easily seen to be a field mapping
respecting k. Compositions of dominant rational maps lead to compositions of
such field mappings in the reverse order.


EXAMPLE, CONTINUED. The two irreducible affine plane curves in the example
earlier in this section have been observed to be birationally equivalent. In view
of the previous paragraph, their function fields must be isomorphic. Taking into
account that the genus of a curve, as defined in Section IX.3, depends only on the
function field, we see that the two curves must have the same genus. This equality
is confirmed by Example 3 of genus in Section IX.3, which shows that the genus
of $k[x, y]/(y^2 - p(x))$, where p(x) is a square-free polynomial of degree m in
characteristic different from 2, is $\tfrac{1}{2}m - 1$ if m is even and is $\tfrac{1}{2}(m - 1)$ if m is odd.
The two curves under study have m = 4 and m = 3, and the genus is 1 in both
cases.
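The arithmetic of the genus formula quoted above is easy to tabulate. The following small sketch simply encodes that formula; the function name `genus` is ours, and the formula is assumed exactly as stated, for square-free p(x) of degree m in characteristic different from 2.

```python
def genus(m):
    """Genus attached to y^2 = p(x) with p square-free of degree m >= 1."""
    if m < 1:
        raise ValueError("the degree m must be positive")
    return m // 2 - 1 if m % 2 == 0 else (m - 1) // 2


# The quartic t^2 = s^4 + 1 has m = 4, and the cubic y^2 = x^3 - 4x has m = 3.
assert genus(4) == genus(3) == 1

# Degrees 2g + 1 and 2g + 2 always give the same genus g.
assert all(genus(2 * g + 1) == genus(2 * g + 2) == g for g in range(1, 6))
```

The second assertion reflects the general pattern of which the pair m = 3 and m = 4 is the first instance with positive genus.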


The main result of this section will be a converse to the construction just made, 
showing how to pass from a k algebra homomorphism between function fields to 
a dominant rational map in the reverse order. We require two lemmas. 


Lemma 10.43. Let V = V(f) be the hypersurface¹³ in $\mathbb{A}^n$ defined by a nonconstant
polynomial f in $k[X_1, \dots, X_n]$. Then the open set $\mathbb{A}^n - V$ is isomorphic
to an affine variety, specifically to the hypersurface in $\mathbb{A}^{n+1}$ corresponding to the
irreducible polynomial $X_{n+1} f(X_1, \dots, X_n) - 1$ in $k[X_1, \dots, X_{n+1}]$.


REMARKS. Even though f is not assumed irreducible, $X_{n+1} f - 1$ is irreducible.
In fact, consideration of the degree in $X_{n+1}$ shows that the only possible nontrivial
factorization is of the form $(X_{n+1} a - b)(c)$ with a, b, c in $k[X_1, \dots, X_n]$. Then
bc = 1, and c has to be scalar. The open set $\mathbb{A}^n - V$ is a quasi-affine variety
(having closure $\mathbb{A}^n$), and the lemma therefore asserts that this quasi-affine variety
is isomorphic to a certain affine variety in $\mathbb{A}^{n+1}$.


¹³ In the application of Lemma 10.43 to Lemma 10.44, it is important that the polynomial f is
allowed to be reducible.


PROOF. Let $W = V(X_{n+1} f - 1)$. Let $\varphi : W \to \mathbb{A}^n$ be the map defined by
$\varphi(x_1, \dots, x_{n+1}) = (x_1, \dots, x_n)$ for $(x_1, \dots, x_{n+1})$ in W. Then $X_j \circ \varphi$ is projection
to the j-th coordinate for $1 \le j \le n$, which is a regular function on W. Lemma
10.39 shows that $\varphi$ is a morphism, and $\varphi$ is one-one onto by inspection. The
inverse function is given by $\varphi^{-1}(x_1, \dots, x_n) = (x_1, \dots, x_n, 1/f(x_1, \dots, x_n))$.
Let $\bar{X}_j$ be the image of $X_j$ in $k[X_1, \dots, X_{n+1}]/(X_{n+1} f - 1)$ for $1 \le j \le n + 1$.
Then $(\bar{X}_j \circ \varphi^{-1})(x_1, \dots, x_n)$ equals $x_j$ for $j \le n$ and equals $1/f(x_1, \dots, x_n)$ for
j = n + 1, and these are regular functions on the complement of V(f) in $\mathbb{A}^n$.
By Lemma 10.39, $\varphi^{-1}$ is a morphism.
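The two maps in this proof are simple enough to write down for a specific polynomial. The sketch below is an illustration under our own assumptions: n = 1, the polynomial is the reducible f(x) = x(x - 1), the rationals stand in for k, and the names `project` and `lift` are ours for $\varphi$ and $\varphi^{-1}$.

```python
from fractions import Fraction


def f(x):
    """A reducible polynomial in one variable; V(f) = {0, 1}."""
    return x * (x - 1)


def on_W(x, y):
    """Membership in the hypersurface W defined by X_2 f(X_1) - 1 = 0."""
    return y * f(x) == 1


def project(x, y):
    """phi : W -> A^1 - V(f), forgetting the last coordinate."""
    return x


def lift(x):
    """phi^{-1} : A^1 - V(f) -> W, sending x to (x, 1/f(x))."""
    x = Fraction(x)
    if f(x) == 0:
        raise ValueError("the point lies on V(f)")
    return (x, 1 / f(x))


samples = [Fraction(n, 2) for n in range(-5, 8) if f(Fraction(n, 2)) != 0]

# lift takes values in W, and project undoes it.
assert all(on_W(*lift(x)) for x in samples)
assert all(project(*lift(x)) == x for x in samples)
```

The extra coordinate records the reciprocal of f, which is why the function 1/f, regular only off V(f), becomes the restriction of a polynomial on W.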


Lemma 10.44. If V is a variety, then there is a base for the Zariski topology 
on V consisting of open sets that are isomorphic to affine varieties. 


PROOF. Let P be in V, and let U be an open subset of V containing P. 
We are to produce an open subset W of U containing P that is isomorphic to 
an affine variety. Since any nonempty open set of a quasiprojective variety is 
a quasiprojective variety, U is a variety. Thus we may assume that U = V. 
Since any projective variety in $\mathbb{P}^n$ is covered by the affine varieties isomorphic
via Proposition 10.37 to nonempty intersections with $\beta_j(\mathbb{A}^n)$, any quasiprojective
variety is covered by quasi-affine varieties. Thus we may assume that U = V
is quasi-affine in $\mathbb{A}^n$. Let X be the closed subset $X = \bar{V} - V$ in $\mathbb{A}^n$, and let
$\mathfrak{a} = I(X)$. Since P is in V, it is not in X, and there exists some f in $\mathfrak{a}$ with
$f(P) \ne 0$. Let Y = V(f). The point P is not in Y, and thus W = V − V(f) is
relatively open in V and contains P.

Being relatively open in V, W is a quasi-affine variety. Since f vanishes on X,
V(f) contains $X = \bar{V} - V$. Thus the equality $W = \bar{V} - V(f)$ exhibits W as a
relatively closed subset of $\mathbb{A}^n - V(f)$, which Lemma 10.43 shows is isomorphic
to an affine variety. Hence W itself is isomorphic to a quasi-affine variety that is
closed in an affine variety. That is, W is isomorphic to an affine variety.


Theorem 10.45. Let U and V be varieties, and let $\varphi \mapsto \tilde{\varphi}$ be the function
carrying dominant rational maps $\varphi : U \to V$ to field mappings $\tilde{\varphi} : k(V) \to k(U)$
respecting the operations by k and given by

$$\tilde{\varphi}(f) = (\text{class of } f|_F \circ \varphi_{E'}),$$

where f is in k(V), f is regular on F, $(E, \varphi_E)$ is a pair in the class $\varphi$, and
$E' = \varphi_E^{-1}(F)$. Then $\varphi \mapsto \tilde{\varphi}$ is one-one onto the set of all field mappings from
k(V) into k(U) respecting k. Furthermore, if $P \in U$ and $Q \in V$ are points, then
the maximal ideal of $\tilde{\varphi}(\mathcal{O}_Q(V))$ is contained in the maximal ideal of $\mathcal{O}_P(U)$ if and
only if P is in the largest domain on which $\varphi$ is a morphism and has $\varphi(P) = Q$.


REMARK. The ring $\mathcal{O}_P(U)$ is the k vector space sum of its maximal ideal
and the constants, since evaluation at P is a well-defined multiplicative linear
functional on $\mathcal{O}_P(U)$, and a similar comment applies to $\mathcal{O}_Q(V)$. Whatever $\tilde{\varphi}$
does, it certainly carries 1 to 1, and hence if $\tilde{\varphi}$ carries the maximal ideal of
$\mathcal{O}_Q(V)$ to the maximal ideal of $\mathcal{O}_P(U)$, then it carries $\mathcal{O}_Q(V)$ to $\mathcal{O}_P(U)$ also.


PROOF. We begin by inverting $\varphi \mapsto \tilde{\varphi}$. Lemma 10.44 shows that any variety
is covered by open subvarieties isomorphic to affine varieties, and the function
fields of the variety and the subvarieties may all be identified with one another.
Thus there is no loss in generality in assuming that V is an affine variety in
$\mathbb{A}^n$. Let $\bar{X}_1, \dots, \bar{X}_n$ be the images in A(V) of $X_1, \dots, X_n$, and suppose that a
k algebra homomorphism $\gamma : k(V) \to k(U)$ is given. Then $\gamma(\bar{X}_1), \dots, \gamma(\bar{X}_n)$
are rational functions on U, and we can find a nonempty open subset E of U on
which all these functions are regular. Since $\gamma$ is a homomorphism, $\gamma$ yields by
restriction of the images a homomorphism $\gamma : A(V) \to \mathcal{O}(E)$. Moreover, this
version of $\gamma$ is one-one on A(V) because $\gamma$ as a field mapping is one-one and
because Corollary 10.34 shows that each member of $\mathcal{O}(E)$ extends in only one
way to a member of k(U). Theorem 10.38 produces a morphism $\psi : E \to V$
such that $\tilde{\psi} = \gamma$ for this restricted version of $\gamma$. Then the equivalence class $\varphi$ of
the pair $(E, \psi)$ is a rational map of U into V.

To see that $\varphi$ is dominant, suppose on the contrary that $\psi(E)$ is contained in a
proper closed subset of V. Then we can find a polynomial f that is 0 on $\psi(E)$ but is not
identically 0 on V. The image $\bar{f}$ of f in A(V) is nonzero. Since the restricted
version of $\gamma$ is one-one, $\gamma(\bar{f})$ is nonzero in $\mathcal{O}(E)$. However, $\gamma(\bar{f}) = \tilde{\psi}(\bar{f}) =
\bar{f} \circ \psi$, and the right side is 0 on E, contradiction.

The construction is arranged in such a way that if we start from $\varphi$, form $\tilde{\varphi}$,
and go through the construction to produce a rational map of U into V, then
the resulting rational map is $\varphi$. In the reverse direction, suppose that we start
from $\gamma$, produce $\varphi$, and then form $\tilde{\varphi}$, and suppose that f in k(V) is in A(V). If
$E \subseteq U$ is as in the first paragraph of the proof, then a representative of $\varphi$ is the
pair $(E, \varphi_E)$, where $\varphi_E$ is the morphism such that $(\varphi_E)^\sim = \gamma$. Then $\tilde{\varphi}(f)$ is the
class of $f \circ \varphi_E$, which equals $(\varphi_E)^\sim(f)$ and hence $\gamma(f)$. In other words, $\gamma$ and $\tilde{\varphi}$
agree on A(V); being field mappings, they agree on k(V). This completes the
proof of the first conclusion of the theorem.

Now suppose that $\varphi$ is a dominant rational map from U to V and that $\tilde{\varphi}$ is the
corresponding field map of k(V) to k(U). Let $P \in U$ and $Q \in V$ be points,
suppose that there is an open neighborhood E of P such that $(E, \varphi_E)$ is in the
equivalence class $\varphi$, and suppose that $\varphi_E(P) = Q$. Lemma 10.44 shows that
there is a base of open neighborhoods of Q in V consisting of open sets that are
isomorphic to affine varieties. Since $\varphi_E$ is by assumption continuous, we can
select any such open neighborhood and assume that $\varphi_E$ carries E into it. Thus
there is no loss of generality in assuming that V is isomorphic to an affine variety.
We associate to $\varphi_E$ the k algebra homomorphism $(\varphi_E)^\sim : \mathcal{O}(V) \to \mathcal{O}(E)$ given
by $(\varphi_E)^\sim(f) = f \circ \varphi_E$ for $f \in \mathcal{O}(V)$. This formula shows that the members f
of $\mathcal{O}(V)$ that vanish at Q are carried to members of $\mathcal{O}(E)$ that vanish at P and
that members of $\mathcal{O}(V)$ that do not vanish at Q go to members of $\mathcal{O}(E)$ that do
not vanish at P. Therefore $(\varphi_E)^\sim$ carries $\mathcal{O}_Q(V)$ into $\mathcal{O}_P(E) = \mathcal{O}_P(U)$.

Conversely suppose that the field map $\tilde{\varphi}$ has the property that the maximal
ideal of $\tilde{\varphi}(\mathcal{O}_Q(V))$ is contained in the maximal ideal of $\mathcal{O}_P(U)$. Possibly by
passing to an open subneighborhood from the outset, we may assume by Lemma
10.44 that U and V are isomorphic to affine varieties. Dropping the isomorphism
from the notation, we can write $\mathcal{O}(V) = A(V) = k[y_1, \dots, y_m]$ by Corollary
10.25. Each $\tilde{\varphi}(y_j)$ is a rational function on U, which we can write as $\tilde{\varphi}(y_j) =
a_j/b_j$ with $a_j$ and $b_j$ in $\mathcal{O}(U) = A(U)$. The hypothesis on $\tilde{\varphi}$ implies that
$\tilde{\varphi}(\mathcal{O}_Q(V)) \subseteq \mathcal{O}_P(U)$, hence that each $\tilde{\varphi}(y_j)$ is regular at P. Thus we may take
each denominator $b_j$ to have $b_j(P) \ne 0$. Choose an open neighborhood of P on
which all $b_j$ are nonvanishing and an open subneighborhood E that is isomorphic
to an affine variety. Since $\tilde{\varphi}$ respects the field operations, it carries any polynomial
in $y_1, \dots, y_m$ to a quotient c/d with c and d in $\mathcal{O}(E)$ and with d nowhere 0 on E.
Therefore c/d is in $\bigcap_{P \in E} \mathcal{O}_P(E) = \mathcal{O}(E)$. That is, $\tilde{\varphi}$ carries $\mathcal{O}(V)$ into $\mathcal{O}(E)$.
Since V is isomorphic to an affine variety, Corollary 10.25 and Theorem 10.38
show that $\tilde{\varphi} : \mathcal{O}(V) \to \mathcal{O}(E)$ is given by the formula

$$\tilde{\varphi}(h)(u) = h(\varphi_E(u)) \tag{$*$}$$

for some morphism $\varphi_E : E \to V$ and all $h \in \mathcal{O}(V)$ and $u \in E$. The first part
of the proof shows that the pair $(E, \varphi_E)$ is in the equivalence class $\varphi$. Hence P
is in the largest domain on which $\varphi$ is a morphism. Arguing by contradiction,
suppose that $\varphi_E(P) = Q' \ne Q$. Choose by Proposition 10.36 a rational function
h on V that is defined at both Q and Q' and has h(Q) = 0 and $h(Q') \ne 0$. Then
$\tilde{\varphi}$ carries $\mathcal{O}_Q(V)$ and its maximal ideal into $\mathcal{O}_P(U)$ and its maximal ideal, and
we obtain $0 = \tilde{\varphi}(h)(P) = h(\varphi_E(P)) = h(Q') \ne 0$, contradiction. We therefore
conclude that $\varphi_E(P) = Q$, and the proof of the second conclusion of the theorem
is complete.


Corollary 10.46. If U and V are varieties, then the following conditions are 
equivalent: 
(a) U and V are birationally equivalent, 
(b) k(U) and k(V) are isomorphic as k algebras, 
(c) there are nonempty open subsets E of U and F of V such that E and F
are isomorphic as varieties. 


PROOF. The equivalence of (a) and (b) follows from Theorem 10.45 and the 
fact that composition of dominant rational maps corresponds to composition of 
homomorphisms of k algebras in the reverse order. 

Let us check that (c) implies (a). If (c) holds, let $\varphi : E \to F$ and $\psi : F \to E$
be morphisms that are inverse to each other. Then the equivalence classes of
$(E, \varphi)$ and $(F, \psi)$ are rational maps from U to V and from V to U, respectively.
The equivalence class of $(E, \psi\circ\varphi) = (E, 1_E)$ is the identity rational map on U,
and the equivalence class of $(F, \varphi\circ\psi) = (F, 1_F)$ is the identity rational map on
V. Hence the rational maps are inverses of one another. This proves (a).

Finally let us check that (a) implies (c). If (a) holds, let $\varphi : U \to V$ and
$\psi : V \to U$ be rational maps that are inverse to each other. Let $(E_1, \varphi)$
and $(F_1, \psi)$ be pairs representing $\varphi$ and $\psi$. Then a pair representing $\psi\circ\varphi$
is $(\varphi^{-1}(F_1), \psi\circ\varphi)$ because $\varphi$ is a morphism on the open subset $\varphi^{-1}(F_1)$ of $E_1$
and $\psi$ is a morphism on the open set $F_1$ containing $\varphi(\varphi^{-1}(F_1))$. Since $\psi\circ\varphi$ is
the identity on U as a rational map, $\psi\circ\varphi$ is the identity morphism on $\varphi^{-1}(F_1)$.
Put $E = \varphi^{-1}(F_1) \subseteq E_1$. Similarly $\varphi\circ\psi$ is the identity morphism on $\psi^{-1}(E_1)$,
and we put $F = \psi^{-1}(E_1) \subseteq F_1$. Let us see that $\varphi(E) \subseteq F$. If e is in E, we are to
exhibit some $e_1 \in E_1$ with $\psi(\varphi(e)) = e_1$, and then $\varphi(e)$ will be in $F = \psi^{-1}(E_1)$;
for this purpose we can take $e_1 = e$, since $\psi\circ\varphi$ is the identity morphism on E.
Similarly $\psi(F) \subseteq E$. Thus $\varphi$ and $\psi$ exhibit E and F as isomorphic varieties.
This proves (c).
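
As a hedged illustration of Corollary 10.46, the following worked example takes U to be the affine line and V to be the cuspidal cubic. The choice of curves and the names t, x, y are ours and are not part of the text above; the example only instantiates conditions (a) through (c).

```latex
% Illustrative example (our choice): U = \mathbb{A}^1 and V = V(Y^2 - X^3) \subset \mathbb{A}^2.
\[
\varphi : \mathbb{A}^1 \to V,\quad \varphi(t) = (t^2, t^3),
\qquad
\psi : V \dashrightarrow \mathbb{A}^1,\quad \psi(x,y) = y/x .
\]
% \varphi is a morphism on E_1 = \mathbb{A}^1, and \psi is a morphism on F_1 = V \setminus \{(0,0)\}.
% (b): in k(V) put t = y/x. Then t^2 = y^2/x^2 = x and t^3 = y, so
\[
k(V) = k(x,y) = k(t) \cong k(\mathbb{A}^1).
\]
% (c): following the proof, E = \varphi^{-1}(F_1) and F = \psi^{-1}(E_1) are
\[
E = \mathbb{A}^1 \setminus \{0\}, \qquad F = V \setminus \{(0,0)\},
\]
% and \varphi|_E, \psi|_F are inverse isomorphisms:
\[
\psi(\varphi(t)) = t^3/t^2 = t, \qquad \varphi(\psi(x,y)) = (y^2/x^2,\, y^3/x^3) = (x, y)
\quad\text{on } y^2 = x^3,\ x \ne 0 .
\]
```

In this example U and V are birationally equivalent, yet $\varphi$ itself is not an isomorphism of $\mathbb{A}^1$ onto V, since its inverse $\psi$ is not defined at the origin.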


7. Zariski’s Theorem about Nonsingular Points 


Sections 1-6 have established the definitions and elementary properties of
varieties, maps between varieties, and dimension. The present section concerns
singularities, which are a fundamental topic of interest in algebraic geometry.^14
This topic was introduced in Section VII.5 in a context that we now recognize as
affine varieties.

The definition of “nonsingular” was motivated by the classical Implicit Function
Theorem. Let k be an algebraically closed field, let the affine space in
question be $\mathbb{A}^n$, and let $\mathfrak{p}$ be the prime ideal such that the affine variety to study
in $\mathbb{A}^n$ is $V(\mathfrak{p})$. If $\{f_i\}$ is a finite set of generators of $\mathfrak{p}$ and if P is in $V(\mathfrak{p})$, then P
is said to be a nonsingular point of $V(\mathfrak{p})$ if
$\operatorname{rank}\big[\tfrac{\partial f_i}{\partial x_j}(P)\big] = n - \dim V(\mathfrak{p})$, and
otherwise it is singular. Zariski’s Theorem, which was formulated as Theorem
7.23 but only partially proved in Chapter VII, addressed this situation. In order
to rephrase the theorem in our current notation, let A(V) be the affine coordinate
ring of V, and let k(V) be the field of fractions of A(V), i.e., the function field
of V. Let $\mathfrak{m}_P$ be the maximal ideal of all members of A(V) vanishing at P, and
let $\mathcal{O}_P(V)$ be the local ring at P; this is the localization of A(V) with respect to
the maximal ideal $\mathfrak{m}_P$ and is a subring of k(V). The maximal ideal of $\mathcal{O}_P(V)$,
consisting of all members of k(V) defined and vanishing at P, will be denoted
by $M_P$. Theorem 7.23, translated into this notation, is as follows.


^14 The exposition in this section is based in part on Chapter I of Hartshorne’s book, Chapter III
of Reid’s book, and Chapter II of Volume 1 of Shafarevich’s books.


Theorem 10.47 (Zariski’s Theorem, rephrased). In the above notation,

    $\dim_k(M_P/M_P^2) = \dim_k(\mathfrak{m}_P/\mathfrak{m}_P^2) \ge \dim V(\mathfrak{p})$,

and P is nonsingular if and only if equality holds. The set of nonsingular points
of $V(\mathfrak{p})$ is nonempty and open.


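Toward the proof of this theorem, we showed in Section VII.5 for all $P \in V(\mathfrak{p})$
that

(a) $\dim_k(M_P/M_P^2) = \dim_k(\mathfrak{m}_P/\mathfrak{m}_P^2)$,
(b) $\dim_k(\mathfrak{m}_P/\mathfrak{m}_P^2) + \operatorname{rank}\big[\tfrac{\partial f_i}{\partial x_j}(P)\big] = n$,
(c) P is a nonsingular point if and only if $\dim_k(\mathfrak{m}_P/\mathfrak{m}_P^2) = \dim V(\mathfrak{p})$.

In addition, we completed most of the proof in the special case that $V(\mathfrak{p})$ is an
irreducible affine hypersurface by showing that

(d) $\dim_k(\mathfrak{m}_P/\mathfrak{m}_P^2) \ge \dim V(\mathfrak{p})$ for all $P \in V(\mathfrak{p})$,
(e) $\dim_k(\mathfrak{m}_P/\mathfrak{m}_P^2) = \dim V(\mathfrak{p})$ for some $P \in V(\mathfrak{p})$.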


Our goal in this section is to complete the proof of Zariski’s Theorem in the general 
case as stated by reducing (d) and (e) for the general case to what has already 
been proved for the special case that V (p) is an irreducible affine hypersurface. 
We need also to see in all cases that the set of nonsingular points is Zariski open. 
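
The rank criterion in the definition of a nonsingular point can be tried on small examples. The sketch below is only an illustration under our own assumptions: the helper names `rank`, `is_nonsingular`, `cusp_jacobian`, and `parabola_jacobian` are hypothetical, the partial derivatives are entered by hand, and the test points are rational points of curves viewed over an algebraically closed field of characteristic 0 such as the complex numbers.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix with rational entries, by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_nonsingular(jacobian_at, point, n, dim):
    """Rank criterion: P is nonsingular iff rank [df_i/dx_j (P)] = n - dim V."""
    return rank(jacobian_at(*point)) == n - dim

# Cuspidal cubic f = y^2 - x^3 in the affine plane (n = 2, dim = 1).
def cusp_jacobian(x, y):
    return [[-3 * x**2, 2 * y]]

# Parabola f = y - x^2 in the affine plane (n = 2, dim = 1).
def parabola_jacobian(x, y):
    return [[-2 * x, 1]]

if __name__ == "__main__":
    print(is_nonsingular(cusp_jacobian, (0, 0), n=2, dim=1))      # origin of the cusp
    print(is_nonsingular(cusp_jacobian, (1, 1), n=2, dim=1))      # another point of the cusp
    print(is_nonsingular(parabola_jacobian, (0, 0), n=2, dim=1))  # vertex of the parabola
```

At the origin of the cusp both partial derivatives vanish, so the rank is 0 rather than n - dim = 1, and by formula (b) the cotangent dimension there is 2.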


Before proceeding, let us mention the significance of Theorem 10.47. The
definition above of nonsingular and singular points extends immediately to
quasi-affine varieties, using the same defining polynomials, and the theorem is
then applicable because the open set of nonsingular points in an affine variety
meets any nonempty open subset of the variety. In the projective case we can pull
matters back to affine space by means of one of the maps $\beta_i : \mathbb{A}^n \to \mathbb{P}^n$. In this
way we obtain definitions of nonsingular and singular point for quasiprojective
varieties, and the theorem remains valid.^15 What is far from obvious with such
a definition is that the decision nonsingular vs. singular for a point is unaffected
by isomorphisms of varieties. On the other hand, the equivalent condition on
$M_P/M_P^2$, as stated in Zariski’s Theorem, is manifestly unaffected by isomorphisms
of varieties because of Proposition 10.42.


^15 Problems 13-16 at the end of the chapter show that the rank computation can alternatively be
made directly with the homogeneous polynomials defining the projective variety in question.




Proposition 10.48. Any m-dimensional variety is birationally equivalent to
an irreducible affine hypersurface H in $\mathbb{A}^{m+1}$.


PROOF. Let V be the variety in question. By definition of dim V, the function
field k(V) is a finitely generated extension field of k of transcendence degree
m over k. Since algebraically closed fields are perfect, Theorem 7.20 shows
that k(V) is “separably generated” over k, and Theorem 7.18 shows as a
consequence that k(V) has a “separating transcendence basis,” i.e., a transcendence
basis $\{x_1,\dots,x_m\}$ such that k(V) is a finite separable algebraic extension of
$k(x_1,\dots,x_m)$. By the Theorem of the Primitive Element, there exists an element
$x_{m+1}$ of k(V) such that $k(V) = k(x_1,\dots,x_m)[x_{m+1}]$. Let $P(X_{m+1})$ be the
minimal polynomial of $x_{m+1}$ over $k(x_1,\dots,x_m)$. Writing out the equation
$P(x_{m+1}) = 0$ and clearing fractions, we see that $x_{m+1}$ satisfies a polynomial
equation

    $a_r(x_1,\dots,x_m)\,x_{m+1}^r + \cdots + a_1(x_1,\dots,x_m)\,x_{m+1} + a_0(x_1,\dots,x_m) = 0$

in which the coefficient polynomials $a_j(X_1,\dots,X_m) \in k[X_1,\dots,X_m]$ have no
nontrivial common factor. In this case the polynomial $f(X_1,\dots,X_{m+1})$ equal
to

    $a_r(X_1,\dots,X_m)\,X_{m+1}^r + \cdots + a_1(X_1,\dots,X_m)\,X_{m+1} + a_0(X_1,\dots,X_m)$

is irreducible in $k[X_1,\dots,X_{m+1}]$. Thus the principal ideal (f) defines an
irreducible affine hypersurface H = V(f) in $\mathbb{A}^{m+1}$ whose affine coordinate ring is
$k[X_1,\dots,X_{m+1}]/(f)$. The field of fractions k(H) is isomorphic to k(V), and
H is birationally equivalent to V by the equivalence of (a) and (b) in Corollary
10.46.
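
A small worked instance may make the construction in this proof concrete. The example below, the twisted cubic in $\mathbb{A}^3$, is our own choice and is offered only as a sketch; any generator of the function field could serve as the element $x_{m+1}$.

```latex
% Illustrative example (our choice): the twisted cubic, with m = \dim V = 1.
\[
V = \{(t, t^2, t^3) : t \in k\} \subset \mathbb{A}^3, \qquad k(V) = k(x_1),
\]
% where x_1 denotes the first coordinate function, so \{x_1\} is a separating transcendence basis.
% Take x_2 to be the second coordinate function. Then x_2 = x_1^2, and
\[
k(V) = k(x_1)[x_2], \qquad P(X_2) = X_2 - x_1^2 ,
\]
% so clearing fractions is unnecessary and the polynomial f of the proof is
\[
f(X_1, X_2) = X_2 - X_1^2 , \qquad H = V(f) \subset \mathbb{A}^2 .
\]
% H is a parabola with k(H) = k(X_1) \cong k(V), so V is birationally equivalent to H.
```

Here V is a curve in $\mathbb{A}^3$ that is not itself a hypersurface, while H is a plane curve with the same function field.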


Lemma 10.49. Every point P in $V(\mathfrak{p})$ has $0 \le \dim_k(M_P/M_P^2) \le n$, and the
set of points P in $V(\mathfrak{p})$ with $\dim_k(M_P/M_P^2) \ge r$ is a Zariski closed subset for
each integer r.


PROOF. The entries of the matrix $\big[\tfrac{\partial f_i}{\partial x_j}\big]$ are polynomials, and the set of points
P of $V(\mathfrak{p})$ for which the matrix $\big[\tfrac{\partial f_i}{\partial x_j}(P)\big]$ has rank $\le s$ is a Zariski closed subset,
being the set on which all (s + 1)-by-(s + 1) minors of the matrix vanish. By
display formula (b) above, the set of points P for which $\dim_k(\mathfrak{m}_P/\mathfrak{m}_P^2) \ge n - s$
is closed, and (a) therefore shows that the set with $\dim_k(M_P/M_P^2) \ge n - s$ is
closed.


PROOF OF THEOREM 10.47. Let $m = \dim V(\mathfrak{p})$, and let a birational mapping
of $V(\mathfrak{p})$ to an affine hypersurface H of $\mathbb{A}^{m+1}$ be given. By the equivalence of (a)
and (c) in Corollary 10.46, there exist nonempty open subsets E of $V(\mathfrak{p})$ and F
of H that are isomorphic as varieties, say by an isomorphism $\varphi : E \to F$. Since
$m = \dim V(\mathfrak{p}) = \dim H$, Proposition 10.11 shows that $m = \dim E = \dim F$
also. For each integer $r \ge 0$, let

    $S_r = \{P \in V(\mathfrak{p}) \mid \dim_k(M_P/M_P^2) \le r\}$,
    $T_r = \{P \in E \mid \dim_k(M_P/M_P^2) \le r\}$,
    $U_r = \{P \in F \subseteq H \mid \dim_k(M_P/M_P^2) \le r\}$.

Lemma 10.49 shows that

    $S_r$, $T_r$, $U_r$ are relatively open in $V(\mathfrak{p})$, E, F, respectively, for each r.    (*)

Application of Proposition 10.42 to $\varphi$ and $\varphi^{-1}$ gives

    $\varphi(T_r) = U_r$ for all $r \ge 0$,    (**)

and the special case of Theorem 10.47 proved in Section VII.5 shows that

    $U_m \ne \varnothing$ and $U_{m-1} = \varnothing$.    (†)

Combining (**) and (†) yields

    $T_m \ne \varnothing$ and $T_{m-1} = \varnothing$.    (††)

Since $S_r \supseteq T_r$, the first of these shows that

    $S_m \ne \varnothing$.    (‡)

If $S_{m-1} \ne \varnothing$, then $E \cap S_{m-1} \ne \varnothing$ because any two nonempty open subsets of
$V(\mathfrak{p})$ have nonempty intersection; but $T_{m-1} = E \cap S_{m-1}$ would then be nonempty,
in contradiction to (††). Thus

    $S_{m-1} = \varnothing$.    (‡‡)

In view of (a), (‡) proves (e) for $V(\mathfrak{p})$, and (‡‡) proves (d) for $V(\mathfrak{p})$. Because of
(‡‡), Lemma 10.49 implies that $S_m$ is Zariski open; thus the set of nonsingular
points is open.


8. Classification Questions about Irreducible Curves 


Sections 1-7 give the fundamentals concerning (quasiprojective) varieties over 
the algebraically closed field k. The remainder of the chapter will address aspects 
of three problems: 


(i) What are all varieties, or in what senses can varieties be classified? 
(ii) To what extent can one make computations in the subject? 
(iii) What can be said when the algebraically closed field k is replaced by a 
general commutative ring with identity? 


Algebraic geometry is an enormous subject, going well beyond these problems. 
For example the investigation of the nature of singularities is in itself a large 
subject, with striking applications to topology and differential equations. The 
use of homological methods ties algebraic geometry closely to topology and to 
number theory, and these methods have bearing on the extent to which compact 
complex manifolds admit the structure of projective varieties. Algebraic geometry 
is an ingredient in the subject of invariant theory, which studies classical varieties 
using representation theory. It is an ingredient also in the subject of algebraic 
groups, which concerns varieties with a group structure in which multiplication 
and inversion are morphisms. 

The present section concerns the first of the three problems listed above, and 
we limit our discussion to irreducible curves, i.e., to varieties of dimension 1. 
We say that an irreducible curve is nonsingular if it is nonsingular at every 
point. We are going to show in this section that each birational equivalence 
class of irreducible curves over k contains a nonsingular projective curve and 
that any two nonsingular projective curves in the birational equivalence class are 
isomorphic as projective varieties.^16 We also will get some information about
how this nonsingular curve in the class is related to the other curves in the class. 
To a great extent the classification of irreducible curves will therefore have been 
reduced to the classification of the birational equivalence classes, which Corollary 
10.46 says is the same thing as a classification of the function fields in one variable 
over k. We will not have anything to say about classifying the function fields in 
one variable except to say that each class has a genus, according to Section IX.3, 
and that every nonnegative integer can arise as a genus, according to Example 3 
of genus in Section IX.3.^17

Chapter IX already contains clues about where to begin. Section IX.1 mentioned
the relevance of Dedekind domains to the study, and Problems 5-11 at
the end of that chapter attached a discrete valuation to each nonsingular point of 
any irreducible affine plane curve. The notions of Dedekind domains, discrete 


^16 The exposition in this section is based in part on Chapter 7 of Fulton’s book, Chapter I of
Hartshorne’s book, Chapter II of Reid’s book, and Volume I by Zariski-Samuel.
^17 The subject of Teichmüller theory in effect addresses this question when k = C.




valuations, and nonsingular points are very closely related, and we begin with 
some equivalences concerning them. Recall from Sections 2 and 4 that the affine 
coordinate ring A(C) of any irreducible affine curve C has Krull dimension 1. 
That is, the Noetherian domain A(C) has the property that every nonzero prime 
ideal is maximal. We have seen that the local ring Op(C) at any point is a 
localization of A(C), namely the localization of A(C) with respect to the maximal 
ideal $\mathfrak{m}_P$ of functions vanishing at P. Furthermore, the proper ideals of such a
localization are exactly the sets $S^{-1}\mathfrak{a}$ with $\mathfrak{a}$ equal to an ideal disjoint from the
set-theoretic complement S of $\mathfrak{m}_P$ in A(C). It follows that every nonzero prime
ideal in Op(C) is maximal. This conclusion extends to the quasiprojective case 
as a consequence of Proposition 10.33. Zariski’s Theorem in Section 7 shows that 
nonsingularity of the point P of C can be detected from Op(C). Consequently 
the following proposition is relevant. 


Proposition 10.50. Let R be a Noetherian local ring that is an integral domain
with the property that the only nonzero prime ideal is the maximal ideal. Let M
be the unique maximal ideal of R, let K be the field of fractions of R, and let
F = R/M be the quotient field. Under the assumption that $M \ne 0$ and therefore
that $R \ne K$, the following conditions on R are equivalent:

(a) R is integrally closed,
(b) R is a Dedekind domain,
(c) R is a principal ideal domain,
(d) R is the valuation ring relative to some discrete valuation of K,
(e) M is a principal ideal,
(f) $\dim_F M/M^2 = 1$.


REMARKS. Consider (f). To see how $M/M^2$ becomes an F vector space in a
natural way, let r + M be a member of F, and let $m + M^2$ be a member of $M/M^2$.
Then $(r + M)(m + M^2) = rm + M^2$ is a well-defined scalar multiplication of
F on $M/M^2$, and $M/M^2$ becomes a vector space over F. Nakayama’s Lemma
(Lemma 8.51 of Basic Algebra, restated in the present book on page xxv) shows
that an equality MN = N for a finitely generated R module N is possible only
if N = 0; since M itself is a finitely generated R module, being an ideal in a
Noetherian ring, and since $M \ne 0$ by assumption, $M^2 = M$ is not possible.
Therefore $\dim_F M/M^2 \ge 1$.


PROOF. If (a) holds, then R satisfies the three conditions (Noetherian, integrally 
closed, every nonzero prime ideal maximal) in the definition of Dedekind domain. 
Thus (a) implies (b). A Dedekind domain with only finitely many maximal ideals 
is a principal ideal domain by Corollary 8.62 of Basic Algebra, and thus (b) implies 
(c). A principal ideal domain is a unique factorization domain by Theorem 8.15 
of Basic Algebra, and thus (c) implies (a) by Proposition 8.41 of Basic Algebra. 




To see that (a) through (c) are equivalent to (d), first suppose that (a) through
(c) hold. Then every fractional ideal in K relative to R is of the form $M^k$ for
some integer k. If $x \ne 0$ is in K, then the principal fractional ideal xR is of the
form $xR = M^k$ for some k. Section VI.2 shows that the formula v(x) = k (with
$v(0) = \infty$) defines a discrete valuation on K, and the definition of v shows that
the valuation ring of v is R. Hence (d) holds. Conversely if (d) holds, then R is
a principal ideal domain by Proposition 6.2; thus (c) and necessarily (a) and (b)
hold.

Let us prove that (e) and (f) are equivalent. If (e) holds, then we can write
$M = (\pi)$ for some $\pi$ in R. If $m + M^2$ is a given element of $M/M^2$, then m is
of the form $m = r\pi$ for some r in R. Hence $(r + M)(\pi + M^2) = r\pi + M^2 =
m + M^2$, and $\dim_F M/M^2 \le 1$. Since the remarks before the proof show that
$\dim_F M/M^2 \ge 1$, (f) holds.

If (f) holds, let $\{\pi + M^2\}$ be an F basis of $M/M^2$. If $m \in M$ is given, then
$m + M^2 = (r + M)(\pi + M^2)$ for some $r \in R$. Therefore $m = r\pi + m'$ with
$m' \in M^2$, and we see that $(\pi) + M^2 = M$. We shall apply Nakayama’s Lemma in
the local ring $R/(\pi)$ with maximal ideal $M/(\pi)$ and with module $N = M/(\pi)$:
Given $m \in M$, we expand $m = r\pi + m'$ with $m' \in M^2$ as $m = r\pi + \sum_j m_j m_j'$.
Then the equality $m + (\pi) = \sum_j m_j m_j' + (\pi)$ in $M/(\pi)$ shows that
$m + (\pi) = \sum_j (m_j + (\pi))(m_j' + (\pi))$,
hence that the coset $m + (\pi)$ lies in $\sum_j (m_j + (\pi))(M/(\pi))$. In other words,
$M/(\pi) = (M/(\pi))^2$. Nakayama’s Lemma shows that $M/(\pi) = 0$, and therefore
$M = (\pi)$. Thus (e) holds.

Finally let us prove that (c) and (e) are equivalent. If (c) holds, then M has to be
principal, and hence (e) holds. Suppose that (e) holds, i.e., that $M = (\pi)$. Let I
be a nonzero proper ideal in R. The ideal $N = \bigcap_{k=1}^{\infty} M^k$ is a finitely generated R
module because R is Noetherian, and it has MN = N. By Nakayama’s Lemma,
N = 0. Since $I \subseteq M$ and since $I \ne 0$, there exists a largest integer $k \ge 1$ such
that $I \subseteq M^k$. Choose $y \ne 0$ in I with y in $M^k = (\pi^k)$ but not in $M^{k+1} = (\pi^{k+1})$.
Let us write $y = a\pi^k$ for some $a \in R$. Since y is not in $M^{k+1}$ and since R is
local, a is a unit in R. Hence $a^{-1}y = \pi^k$ is in I, and therefore $M^k = (\pi^k) \subseteq I$.
Since we arranged that $I \subseteq M^k$, we obtain $I = M^k = (\pi^k)$. Thus (c) holds.
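
To see what the equivalent conditions of Proposition 10.50 distinguish, it may help to compare two local rings. Both examples are our own illustrative choices, with t an indeterminate; they are a sketch and not part of the proof.

```latex
% Example 1 (our choice): the local ring of the affine line at 0.
\[
R_1 = k[t]_{(t)}, \qquad M = (t), \qquad M/M^2 = k\,(t + M^2), \qquad \dim_F M/M^2 = 1 .
\]
% All of (a)-(f) hold; the discrete valuation in (d) is the order of vanishing at t = 0.

% Example 2 (our choice): the local ring at the origin of the cusp Y^2 = X^3,
% written via x = t^2, y = t^3 as a localization of k[t^2, t^3].
\[
R_2 = k[t^2, t^3]_{(t^2, t^3)}, \qquad M = (t^2, t^3), \qquad M^2 = (t^4, t^5, t^6) .
\]
% A nonzero combination a t^2 + b t^3 with a, b \in k has order 2 or 3 in t, while every
% element of M^2 has order at least 4. Hence t^2 + M^2 and t^3 + M^2 are linearly independent:
\[
\dim_F M/M^2 = 2 ,
\]
% so (f) fails. Condition (a) fails as well: t = t^3/t^2 lies in the field of fractions and is a
% root of the monic polynomial Z^2 - t^2 over R_2, but t \notin R_2, because u = t s with
% u, s \in k[t^2, t^3] and s(0) \neq 0 would force u to have a nonzero coefficient of t.
```

The second ring satisfies all the standing hypotheses of the proposition, so the example shows that conditions (a) through (f) are genuinely restrictive.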


Corollary 10.51. Let C be an irreducible quasiprojective curve over k, and 
let k(C) be its function field. If P is a point of C, then the following conditions 
are equivalent: 

(a) P is a nonsingular point,

(b) Op(C) is the valuation ring of some discrete valuation of k(C) defined 
over k, 

(c) Op(C) is integrally closed. 


PROOF. Let $M_P$ be the unique maximal ideal of $\mathcal{O}_P(C)$. Zariski’s Theorem
(Theorem 10.47) shows that (a) holds if and only if $\dim_k M_P/M_P^2 = 1$. The
corollary therefore follows from the equivalence of (f), (d), and (a) in Proposition
10.50, along with the observation that any discrete valuation produced by (d) has
to be 0 on $k^\times$.


Corollary 10.52. If C is an irreducible affine curve over k with affine coordi- 
nate ring A(C), then the following conditions on C are equivalent: 
(a) A(C) is integrally closed, 
(b) Op(C) is integrally closed for each point P of the curve, 
(c) C is nonsingular. 


PROOF. If A(C) is integrally closed, then Corollary 8.48c of Basic Algebra
shows that each localization $\mathcal{O}_P(C)$ is integrally closed. Conversely if each
$\mathcal{O}_P(C)$ is integrally closed and if a member f of the function field k(C) is given
that is a root of a monic polynomial with coefficients in A(C), then f is a root of
the same polynomial with coefficients in $\mathcal{O}_P(C)$ and is in $\mathcal{O}_P(C)$ because $\mathcal{O}_P(C)$
is integrally closed. Corollary 10.25 shows that $A(C) = \bigcap_P \mathcal{O}_P(C)$. Therefore
f lies in A(C), and A(C) is integrally closed. This proves that (a) and (b) are
equivalent. The equivalence of (b) and (c) follows from Corollary 10.51.


We turn our attention to constructing a nonsingular irreducible projective curve 
whose field of rational functions is a given function field K in one variable over 
k. If C is any irreducible quasiprojective curve with k(C) = K, then Corollary 
10.51 associates a discrete valuation of K over k to each nonsingular point of C. 
To get an idea what C must be like if it is to be nonsingular at every point, we 
now prove a theorem in the converse direction, associating a point of the curve 
to each discrete valuation of K over k. 


Theorem 10.53. Let C be an irreducible projective curve with function field
k(C) equal to K, and let v be a discrete valuation of K defined over k. If $R_v$ is the
valuation ring of v and $\mathfrak{p}_v$ is the valuation ideal, then there exists a unique point
P on the curve for which the maximal ideal $M_P$ of $\mathcal{O}_P(C)$ has $M_P \subseteq \mathfrak{p}_v$.


PROOF OF UNIQUENESS. Assume the contrary. If P and Q are distinct points
with $M_P \subseteq \mathfrak{p}_v$ and $M_Q \subseteq \mathfrak{p}_v$, then Proposition 10.36 constructs a function h in
k(C) with h defined at P and Q, h(P) = 0, and $h(Q) \ne 0$. This function h
is in $M_P$, and h - h(Q) is in $M_Q$. The assumed inclusions of maximal ideals
imply that $v(h) \ge 1$ and that $v(h - h(Q)) \ge 1$. On the other hand, $h(Q) \ne 0$
implies that v(h(Q)) = 0. Thus $0 = v(h(Q)) \ge \min\big(v(h(Q) - h), v(h)\big) \ge 1$,
contradiction.


PROOF OF EXISTENCE. It is shown in Problem 12 at the end of the chapter that
any projective variety in $\mathbb{P}^r$ is isomorphic to a projective variety V in some $\mathbb{P}^n$
with $n \le r$ such that V is not contained in any subvariety $\{[x_0,\dots,x_n] \mid x_j = 0\}$
with $0 \le j \le n$. That being so, we may assume that C is a projective variety
in $\mathbb{P}^n$ and that $C \cap \beta_j(\mathbb{A}^n) \ne \varnothing$ for $0 \le j \le n$, where $\beta_j : \mathbb{A}^n \to \mathbb{P}^n$ is the
embedding defined after Proposition 10.18. Let $A(C) = k[X_0,\dots,X_n]/I(C)$
be the homogeneous coordinate ring of C, and for each j, let $x_j$ be the image of
$X_j$ in A(C). Since I(C) does not contain $X_j$, $x_j$ is not the 0 element of A(C).
Since $X_i$ and $X_j$ are homogeneous of the same degree, each function $x_i/x_j$ is a
well-defined member of the function field k(C).

Let $N = \max_{i,j} v(x_i/x_j)$. Possibly by renaming some coordinate $x_{j_0}$ as $x_0$,
we may assume that $v(x_{i_0}/x_0) = N$ for some $i_0$. Then we have
$v(x_i/x_0) = v(x_{i_0}/x_0) + v(x_i/x_{i_0}) = N - v(x_{i_0}/x_i) \ge 0$ for all i. Consequently each function
$x_i/x_0$ lies in the subring $R_v$ of k(C).

Theorem 10.20 and Corollary 10.22 show that $C_0 = \beta_0^{-1}(C)$ is an
irreducible affine curve and that its prime ideal is $I(C_0) = \beta_0^t(I(C))$.
Consequently the substitution homomorphism $\beta_0^t : k[X_0,\dots,X_n] \to k[X_1,\dots,X_n]$
descends to a homomorphism of $A(C) = k[X_0,\dots,X_n]/I(C)$ onto $A(C_0) =
k[X_1,\dots,X_n]/I(C_0)$ that carries $x_0$ in A(C) to 1 and carries the members
$x_1,\dots,x_n$ of A(C) to the generators of $A(C_0)$. The members $x_j/x_0$ of k(C)
therefore get identified with the generators of $A(C_0)$, and we conclude that
$A(C_0) \subseteq R_v$.

Define $\mathfrak{q} = \mathfrak{p}_v \cap A(C_0)$. This is a prime ideal of $A(C_0)$, and it pulls back
under the quotient homomorphism $k[X_1,\dots,X_n] \to A(C_0)$ to a prime ideal $\tilde{\mathfrak{q}}$
containing $I(C_0)$. Then $V(\tilde{\mathfrak{q}})$ is an affine subvariety of $C_0$. Since $\dim C_0 = 1$,
there are only two possibilities. One is that $\dim V(\tilde{\mathfrak{q}}) = 1$, in which case $V(\tilde{\mathfrak{q}}) =
C_0$, $\tilde{\mathfrak{q}} = I(C_0)$, and $\mathfrak{q} = 0$. The other is that $\dim V(\tilde{\mathfrak{q}}) = 0$, in which case
$V(\tilde{\mathfrak{q}}) = \{P\}$ for some point P that necessarily lies on $C_0$. In the first case, v
is 0 on every nonzero member of $A(C_0)$ and hence is 0 on $k(C)^\times$, contradiction.
Thus we are in the second case. Then $\tilde{\mathfrak{q}}$ is maximal in $k[X_1,\dots,X_n]$, $\mathfrak{q}$ is
maximal in $A(C_0)$, $\mathfrak{q}$ is the ideal $\mathfrak{m}_P$ of all members of $A(C_0)$ vanishing at P,
and $A(C_0)/\mathfrak{q} \cong k$. If S denotes the set-theoretic complement of $\mathfrak{q}$ in $A(C_0)$, then
no member of S can be in $\mathfrak{p}_v$ because then $\mathfrak{q} + k1 = A(C_0)$ would be in $\mathfrak{p}_v$,
contradiction. Thus v(s) = 0 for all $s \in S$, and $M_P = S^{-1}\mathfrak{m}_P \subseteq \mathfrak{p}_v$.
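
The chart selection in the existence proof can be traced by hand on the simplest projective curve. The sketch below is a toy illustration under our own assumptions: C is the projective line with t = x_1/x_0, the field of coefficients is the rationals, a valuation is specified by a hypothetical parameter `place` (a number a for the order of vanishing at t = a, or the string "inf" for the valuation deg(denominator) - deg(numerator)), and all function names are ours.

```python
from fractions import Fraction

def degree(poly):
    """Degree of a nonzero polynomial given as coefficients, constant term first."""
    return max(i for i, c in enumerate(poly) if c != 0)

def ord_at(poly, a):
    """Order of vanishing at t = a of a nonzero polynomial."""
    coeffs = [Fraction(c) for c in poly[: degree(poly) + 1]]
    order = 0
    while True:
        # Horner's scheme: divide by (t - a), reading coefficients from the top.
        acc, partial = Fraction(0), []
        for c in reversed(coeffs):
            acc = acc * a + c
            partial.append(acc)
        remainder = partial.pop()
        if remainder != 0:
            return order
        order += 1
        coeffs = partial[::-1]  # quotient, constant term first

def valuation(num, den, place):
    """v(num/den) for the valuation of k(t) attached to `place`."""
    if place == "inf":
        return degree(den) - degree(num)
    return ord_at(num, place) - ord_at(den, place)

def point_for_valuation(place):
    """Return (chart index j, homogeneous coordinates [x_0, x_1]) of the point."""
    t, one = [0, 1], [1]
    v10 = valuation(t, one, place)  # v(x_1/x_0)
    v01 = valuation(one, t, place)  # v(x_0/x_1)
    if v10 >= v01:
        # The maximum N is attained with denominator x_0: work in the chart x_0 != 0,
        # where the valuation ideal meets k[t] in (t - a).
        return 0, (1, place)
    # Otherwise N = v(x_0/x_1) > 0: work in the chart x_1 != 0 with coordinate
    # s = x_0/x_1, and v(s) > 0 singles out the point s = 0.
    return 1, (0, 1)

if __name__ == "__main__":
    for place in (0, 2, "inf"):
        print(place, point_for_valuation(place))
```

In this toy case the step of renaming a coordinate as x_0 amounts to choosing which of the two standard charts contains the point, and only the valuation at infinity forces the second chart.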


Corollary 10.54. If $\varphi$ is a rational map from an irreducible curve C' to an
irreducible projective curve C, then the largest domain on which $\varphi$ is a morphism
contains every nonsingular point of C'. If C' is nonsingular, then $\varphi$ is a morphism
from C' into C.


PROOF. If $\varphi$ is not dominant, then Problem 6 at the end of the chapter shows that
$\varphi$ is constant. Certainly the largest domain on which a constant $\varphi$ is a morphism
is C'.

Thus suppose that $\varphi$ is dominant. Using the notation introduced early in
Section 6, let $\tilde\varphi : k(C) \to k(C')$ be the associated field map of function fields.
Since k(C) and k(C') both have transcendence degree 1 over k and since k(C') is
finitely generated as a field over k, the field k(C') is a finite algebraic extension
of the field $\tilde\varphi(k(C))$. If v is any discrete valuation of k(C'), then it follows from
the finiteness of this extension that v cannot be identically 0 on $\tilde\varphi(k(C))^\times$; in fact,
if it were identically 0, then the expansion $x = \sum_{j=1}^{m} c_j x_j$ of a general element
x of k(C') in terms of a vector-space basis $\{x_1,\dots,x_m\}$ of k(C') over $\tilde\varphi(k(C))$
would yield the inequality $v(x) \ge \min_j v(x_j)$, which cannot be true for all x.
Meanwhile, if P is a nonsingular point of C', then Corollary 10.51 shows that
$\mathcal{O}_P(C')$ is the valuation ring $R_v$ for some valuation v of k(C') over k. The maximal
ideal $M_P$ of $\mathcal{O}_P(C')$ equals the valuation ideal $\mathfrak{p}_v$ of v. Since the restriction of v to
$\tilde\varphi(k(C))^\times$ is not identically 0, the restriction comes from some positive multiple e
of a discrete valuation on $\tilde\varphi(k(C))$. Let $v_0$ be the corresponding discrete valuation
of k(C); this is given by $v_0(f) = e^{-1}v(\tilde\varphi(f))$. Let $R_0$ be its valuation ring and
$\mathfrak{p}_0$ be its valuation ideal in k(C); the latter is given by $\mathfrak{p}_0 = \tilde\varphi^{-1}(\mathfrak{p}_v)$. Theorem
10.53 shows that there exists a unique point Q on the curve C such that the
maximal ideal $M_Q$ of $\mathcal{O}_Q(C)$ is contained in $\mathfrak{p}_0$. That is, $M_Q \subseteq \mathfrak{p}_0 = \tilde\varphi^{-1}(\mathfrak{p}_v)$.
Application of $\tilde\varphi$ gives $\tilde\varphi(M_Q) \subseteq \tilde\varphi\tilde\varphi^{-1}(\mathfrak{p}_v) \subseteq \mathfrak{p}_v = M_P$. Theorem 10.45 shows
that consequently P is in the largest domain on which $\varphi$ is a morphism and that
$\varphi(P) = Q$.


Corollary 10.55. If two nonsingular irreducible projective curves are bira- 
tionally equivalent, then they are isomorphic as varieties. 


PROOF. This follows by applying Corollary 10.54 twice.


Corollary 10.56. If C is a nonsingular irreducible projective curve with
function field K = k(C), then the points of C are in one-one correspondence
with the discrete valuations of K defined over k.


PROOF. This is the correspondence given in one direction by Corollary 10.51 
and in the reverse direction by Theorem 10.53. 


Corollary 10.56 has a remarkable conclusion, but the corollary assumes the
existence of a nonsingular projective curve, which we have not yet proved. In more
detail we now know that a nonsingular point P of any irreducible projective curve
C picks out a unique discrete valuation v of the function field K = k(C), namely
the one whose valuation ring is given by $R_v = \mathcal{O}_P(C)$, and that conversely when
C is projective, any discrete valuation v' defined over k picks out a certain point P'
of C with the property that $\mathcal{O}_{P'}(C) \subseteq R_{v'}$. If P is nonsingular and we go through
the first step and then the second, using v' = v, we obtain $\mathcal{O}_{P'}(C) \subseteq \mathcal{O}_P(C)$.
Proposition 10.36 shows that P' = P, and hence the second process inverts the
first. That is what Corollary 10.56 says. Also, we know from Theorem 10.47 that
many discrete valuations are involved in this process, since the set of nonsingular
points of a variety is Zariski open. What we do not know is that any given discrete
valuation over k ever yields a nonsingular point for any curve with the function
field K. This missing piece of information will be supplied in Corollary 10.58
below. To prove Corollary 10.58, we shall make use of the following theorem,
which we need only in the case that the field in its statement is our algebraically
closed field k. We postpone the proof of the theorem for a moment, and when we
give the proof, we shall give it only for the case that the field k in the statement is
algebraically closed.


Theorem 10.57. Let k be a field, let $R = k[x_1,\dots,x_n]$ be a finitely generated
integral domain over k, let K be the field of fractions of R, and let L be a finite
algebraic extension of K. Then the integral closure T of R in L is a finitely
generated R module.


Corollary 10.58. Let C be an irreducible projective curve with function field
K = k(C), let P be a point of C, and let $M_P$ be the maximal ideal of $\mathcal{O}_P(C)$.
Then there exists a discrete valuation v of K defined over k whose valuation ideal
$\mathfrak{p}_v$ has $M_P \subseteq \mathfrak{p}_v$.


REMARKS. This result is a supplement to Theorem 10.53. It says that the map 
of that theorem, carrying discrete valuations of K defined over k to points of C, 
is onto. 


PROOF. Without loss of generality, we may assume that C is affine. Let $\mathfrak{m}_P$ be
the maximal ideal in the affine coordinate ring A(C) consisting of all functions
vanishing at P, and let S be the set-theoretic complement of $\mathfrak{m}_P$ in A(C), so
that $M_P = S^{-1}\mathfrak{m}_P$. Evaluation at P is a linear functional on A(C) with kernel
$\mathfrak{m}_P$, and therefore $A(C) = \mathfrak{m}_P + k1$. In other words, $\mathfrak{m}_P$ and any element of S
together generate A(C) as a k vector space.

If T denotes the integral closure of A(C) in K, then Theorem 10.57 implies that
T is Noetherian, and Proposition 8.45 of Basic Algebra shows that every nonzero
prime ideal of T is maximal. Hence T is a Dedekind domain. Proposition
8.53 of Basic Algebra shows that there exists a maximal ideal $\mathfrak{q}$ of T such that
$\mathfrak{m}_P = A(C) \cap \mathfrak{q}$. Since T is a Dedekind domain, $\mathfrak{q}$ is contained in the valuation
ideal $\mathfrak{p}_v$ of a unique discrete valuation v of K, and T is contained in the valuation
ring $R_v$ of v. Thus $\mathfrak{m}_P \subseteq \mathfrak{p}_v$, and $S \subseteq T$ implies that $v(s) \ge 0$ for all $s \in S$. On the
other hand, 1 lies in $\mathfrak{m}_P + ks$ for any s in S, and hence $0 = v(1) \ge \min(1, v(s))$.
Therefore v(s) = 0 for all $s \in S$, and $M_P = S^{-1}\mathfrak{m}_P \subseteq \mathfrak{p}_v$.
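
A singular point can lie under more than one valuation, which is why Corollary 10.58 asserts only that the map of Theorem 10.53 is onto. The nodal cubic below is our own illustrative choice, with the assumption that the characteristic of k is not 2; it is a sketch of the construction in the proof and not part of it.

```latex
% Illustrative example (our choice, char k \neq 2): the nodal cubic and its origin.
\[
C = V\big(Y^2 - X^2(X+1)\big) \subset \mathbb{A}^2, \qquad P = (0,0), \qquad \mathfrak{m}_P = (x, y) .
\]
% Put t = y/x in K = k(C). Then t^2 = x + 1, so
\[
x = t^2 - 1, \qquad y = t(t^2 - 1), \qquad K = k(t), \qquad A(C) = k[t^2 - 1,\ t(t^2-1)] \subseteq k[t] .
\]
% t is a root of the monic polynomial Z^2 - (x+1) over A(C), and k[t] is integrally closed, so
\[
T = k[t] .
\]
% A maximal ideal (t - a) of T contracts to \mathfrak{m}_P exactly when x and y vanish at t = a:
\[
a^2 - 1 = 0 \iff a = \pm 1, \qquad \mathfrak{q}_{\pm} = (t \mp 1), \qquad A(C) \cap \mathfrak{q}_{\pm} = \mathfrak{m}_P .
\]
% Both valuations v_{+} = \operatorname{ord}_{t=1} and v_{-} = \operatorname{ord}_{t=-1} satisfy
% M_P \subseteq \mathfrak{p}_{v_{\pm}}.
```

Thus two distinct discrete valuations correspond to the singular point P, consistent with the fact that the one-one correspondence of Corollary 10.56 requires nonsingularity.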


Corollary 10.59. If K is a function field in one variable over k and if v is a
discrete valuation of K defined over k with valuation ring $R_v$, then there exists
an irreducible nonsingular affine curve C over k with function field K and with a
point P such that $\mathcal{O}_P(C) = R_v$.




PROOF. Choose an element x of K such that v(x) > 0. Define R = k[x].
Since $v(x) \ne 0$, x is transcendental over k, and K is a finite algebraic extension
of the field of fractions k(x) of R. Corollary 7.14 shows that the integral closure
T of R in K is a Dedekind domain, and Theorem 10.57 shows that T is a finitely
generated R module. Thus we can write T as $T = k[x_1,\dots,x_n]$ with $x_1 = x$.
The substitution homomorphism with $X_j \mapsto x_j$ for all j carries $k[X_1,\dots,X_n]$
onto T and has a prime ideal $\mathfrak{p}$ as kernel, since T is an integral domain. Thus
$V(\mathfrak{p})$ is an affine variety with T as its affine coordinate ring. The dimension of
$V(\mathfrak{p})$ is the transcendence degree of K over k, which is 1 by assumption. Thus
$C = V(\mathfrak{p})$ is an irreducible curve. Since T is integrally closed by construction,
Corollary 10.52 shows that C is nonsingular.

Let $R_v \subseteq K$ be the valuation ring of v, and let $\mathfrak{p}_v$ be the valuation ideal. The
inequality v(x) > 0 shows that v is $\ge 0$ on R = k[x], and Proposition 6.7 says
that v is consequently $\ge 0$ on the integral closure T of R in K. In other words, T
is contained in $R_v$. Since T is a Dedekind domain and K is its field of fractions,
Theorem 6.5 shows that $\mathfrak{q} = \mathfrak{p}_v \cap T$ is a nonzero prime (= maximal) ideal of T
and that the discrete valuation $v_{\mathfrak{q}}$ of K over k determined by $\mathfrak{q}$ coincides with v.
The maximal ideals of the affine coordinate ring of an affine variety correspond
to the points of the variety by Proposition 10.23, and thus there exists a point P
of C such that $\mathfrak{q}$ is the maximal ideal of T consisting of all functions vanishing
at P. The localization of T with respect to $\mathfrak{q}$ is $\mathcal{O}_P(C)$ by definition and is $R_v$ by
Proposition 6.4. Therefore $\mathcal{O}_P(C) = R_v$.
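
The steps of this construction can be followed in a small case. In the sketch below the field K = k(t), the valuation, and the element x = t^2 are our own illustrative choices; the choice x = t would work equally well and would produce the affine line.

```latex
% Illustrative example (our choice): K = k(t), v = order of vanishing at t = 0, x = t^2.
\[
v(x) = 2 > 0, \qquad R = k[x] = k[t^2] .
\]
% t is a root of the monic polynomial Z^2 - x over R, and k[t] is integrally closed
% with field of fractions K, so the integral closure of R in K is
\[
T = k[t] = R \oplus R\,t = k[x_1, x_2], \qquad x_1 = t^2,\quad x_2 = t .
\]
% The substitution homomorphism k[X_1, X_2] \to T has kernel
\[
\mathfrak{p} = (X_1 - X_2^2), \qquad C = V(\mathfrak{p}) \subset \mathbb{A}^2 \ \text{(a parabola)} .
\]
% The valuation ideal meets T in the functions vanishing at the origin:
\[
\mathfrak{q} = \mathfrak{p}_v \cap T = (t), \qquad P = (0,0), \qquad \mathcal{O}_P(C) = k[t]_{(t)} = R_v .
\]
```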


Corollary 10.60. Let C be the irreducible nonsingular affine curve constructed
in Corollary 10.59 and having function field K = k(C), and regard C as a
subvariety of its projective closure $\overline{C}$. Then there are only finitely many discrete
valuations v' of K defined over k such that the unique point P of $\overline{C}$ with $M_P \subseteq \mathfrak{p}_{v'}$,
where $M_P$ is the maximal ideal of $\mathcal{O}_P(\overline{C})$ and $\mathfrak{p}_{v'}$ is the valuation ideal of v', lies
outside C.


PROOF. We go over the argument in Corollary 10.59 with the same element
x and with any discrete valuation v' defined over k such that $v'(x) \ge 0$. This
inequality implies that v' is $\ge 0$ on k[x], and Proposition 6.7 then shows that v' is
$\ge 0$ on T = A(C). Thus A(C) is contained in the valuation ring $R_{v'}$ of v'. Define
$\mathfrak{q} = \mathfrak{p}_{v'} \cap A(C)$. Arguing as in the existence proof for Theorem 10.53, we find
that $\mathfrak{q}$ equals the ideal $\mathfrak{m}_P$ of all members of A(C) vanishing at a certain point
P of C, and that proof then shows that $M_P \subseteq \mathfrak{p}_{v'}$. By uniqueness in Theorem
10.53, this P is the one and only point produced by that theorem.

In other words, the only discrete valuations v' of K defined over k for which
the point P lies outside C are those with v'(x) < 0. Corollary 6.10 shows that
there are only finitely many of these.
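
For a concrete count of the exceptional valuations, one can take K = k(t) and x = t^2, so that the curve produced by the construction of Corollary 10.59 is a parabola. This setup is our own illustrative assumption and is restated here so that the example stands alone.

```latex
% Illustrative example (our choice): K = k(t), x = t^2, T = k[t] = k[x_1, x_2] with x_1 = t^2, x_2 = t.
\[
C = V(X_1 - X_2^2) \subset \mathbb{A}^2, \qquad
\overline{C} = V(X_0 X_1 - X_2^2) \subset \mathbb{P}^2, \qquad
\overline{C} \setminus C = \{[0,1,0]\} .
\]
% The discrete valuations of k(t) over k are \operatorname{ord}_{t=a} for a \in k and
% v_\infty(f/g) = \deg g - \deg f. On x = t^2 they take the values
\[
\operatorname{ord}_{t=a}(x) \ge 0 \quad (a \in k), \qquad v_\infty(x) = -2 < 0 .
\]
% Each \operatorname{ord}_{t=a} corresponds to the point (a^2, a) of C, and only v_\infty
% corresponds to the point [0,1,0] outside C.
```

In this example exactly one valuation has v'(x) < 0, matching the single point at infinity of the projective closure.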




We come to the proof of Theorem 10.57, but only under the assumption that k is algebraically 
closed. The proof is rather technical, and the reader is encouraged to skip it on first reading. To 
underscore this point, the proof appears in small print. We need two lemmas. 


Lemma 10.61. Let $R$ be a Noetherian integrally closed domain with field of fractions $F$, let $K$ be
a finite separable extension of $F$, and let $T$ be the integral closure of $R$ in $K$. Then $T$ is Noetherian
and is finitely generated as an $R$ module.


PROOF. In effect, this result was proved in Basic Algebra. In more detail: With the above
assumptions and also the assumption that every nonzero prime ideal of R is maximal (i.e., that R 
is a Dedekind domain), the proof of Theorem 8.54 of Basic Algebra showed that T is a Dedekind 
domain. The hard part of that proof appeared in Section IX.15; it showed from the separability that
T is finitely generated as an R module, and it did not make use of the assumption that every nonzero 
prime ideal of R is maximal. Since T is finitely generated and R is Noetherian, every R submodule 
of T is a finitely generated R module, by Proposition 8.34 of Basic Algebra. In particular, every 
ideal of T is finitely generated as an R module and therefore is finitely generated as a T module. 
Consequently T is Noetherian. 


Lemma 10.62 (Noether Normalization Lemma). Let $k$ be an infinite field, let $R = k[x_1,\dots,x_n]$
be a finitely generated integral domain over $k$, and let $K = k(x_1,\dots,x_n)$ be the field of fractions of
$R$. Then for a suitable $d$ with $0 \leq d \leq n$, there exist $d$ linear combinations $y_1,\dots,y_d$ of $x_1,\dots,x_n$
with coefficients in $k$ such that $y_1,\dots,y_d$ are algebraically independent over $k$ and such that every
element of $R$ is integral over $k[y_1,\dots,y_d]$. If $K$ is separably generated over $k$, then the $y_j$ may be
chosen in such a way that $K$ is a separable extension of $k(y_1,\dots,y_d)$.


REMARKS. It is immediate from the conclusion that d is the transcendence degree of K over k. 
The lemma is a result about the extension of rings that improves upon Theorem 7.7 for fields; the 
latter says that every field extension can be accomplished by a transcendental extension followed by 
an algebraic extension. The present lemma says that the passage from a field to a finitely generated 
integral domain can be accomplished by a full polynomial extension followed by an extension in 
which each generator is not merely algebraic but actually is a root of a monic polynomial with 
coefficients in the full polynomial ring. 


PROOF. Let $J$ be the kernel of the quotient homomorphism $k[X_1,\dots,X_n] \to k[x_1,\dots,x_n]$.
The core of the proof involves a single nonzero $f$ in $J$. The idea is to replace $X_1,\dots,X_{n-1}$ by new
indeterminates $X'_1,\dots,X'_{n-1}$ to make the equation $f(x_1,\dots,x_n) = 0$ become a monic polynomial
equation satisfied by $x_n$ over $R' = k[x'_1,\dots,x'_{n-1}]$. With $c_1,\dots,c_{n-1}$ equal to members of $k$ to be
specified later, define $x'_j = x_j - c_j x_n$ for $1 \leq j \leq n-1$. The equation $f(x_1,\dots,x_n) = 0$ becomes

$$f(x'_1 + c_1 x_n, \dots, x'_{n-1} + c_{n-1} x_n, x_n) = 0. \qquad (*)$$

For a suitable choice of $c_1,\dots,c_{n-1}$, we shall show in a moment that

the polynomial $f(X'_1 + c_1 X_n, \dots, X'_{n-1} + c_{n-1} X_n, X_n)$ is monic in $X_n$ \qquad (**)

after multiplication by a member of $k^\times$.

Assuming (**), let us see how the first conclusion of the lemma follows by induction on $n$. For
$n = 1$, there are two cases. One case is that $K$ is a simple algebraic extension field of $k$, and then
every element of the extension field $R = K$ is a root of its minimal polynomial over $k$. This is the
case $d = 0$. The other case is that $K$ is a simple transcendental extension, and then we can take
$y_1 = x_1$. This is the case $d = 1$.

For the inductive step, assume the first conclusion of the lemma for $n - 1 \geq 1$, $d$ being an integer
with $0 \leq d \leq n-1$. If $J = 0$, there is nothing to prove, since $x_1,\dots,x_n$ are then algebraically
independent and the lemma follows with $d = n$ and with $y_j = x_j$ for $1 \leq j \leq n$. If $J \neq 0$, fix $f \neq 0$
in $J$, and choose $c_1,\dots,c_{n-1}$ in $k$ to make (**) hold. Then (*) shows that $x_n$ is a root of a monic
polynomial with coefficients in $R' = k[x'_1,\dots,x'_{n-1}]$. By the inductive hypothesis we can choose
members $y'_1,\dots,y'_d$ of $R'$ with $0 \leq d \leq n-1$ such that $y'_1,\dots,y'_d$ are algebraically independent
over $k$ and such that every element of $R'$ is integral over $k[y'_1,\dots,y'_d]$. By transitivity of integral
dependence, every element of $R'[x_n]$ is integral over $k[y'_1,\dots,y'_d]$. Since the definition of $x'_j$ in
terms of $x_j$ shows that $R'[x_n] = k[x'_1,\dots,x'_{n-1},x_n] = k[x_1,\dots,x_{n-1},x_n] = R$, every element of
$R$ is integral over $k[y'_1,\dots,y'_d]$. This completes the induction, and the first sentence of conclusions
of the lemma is proved except for (**).
To prove (**), let $r = \deg f$, and write $f = h_r + g$ with $h_r$ nonzero and homogeneous of degree
$r$ and with $\deg g \leq r - 1$ (or $g = 0$). Then

$$\begin{aligned}
f(X'_1 + c_1 X_n, \dots, X'_{n-1} + c_{n-1} X_n, X_n)
  &= h_r(c_1 X_n, \dots, c_{n-1} X_n, X_n) + (\text{terms involving } 1, X_n, X_n^2, \dots, X_n^{r-1}) \\
  &= h_r(c_1, \dots, c_{n-1}, 1)\, X_n^r + (\text{terms involving } 1, X_n, X_n^2, \dots, X_n^{r-1}).
\end{aligned}$$

Thus (**) is proved if $c_1,\dots,c_{n-1}$ can be chosen with the scalar $h_r(c_1,\dots,c_{n-1},1)$ not $0$. Here the
fact that $h_r$ is nonzero and homogeneous implies that $h_r(X_1,\dots,X_{n-1},1)$ is not the $0$ polynomial
in $k[X_1,\dots,X_{n-1}]$. Since $k$ is an infinite field, Corollary 4.32 of Basic Algebra shows that the
evaluation mapping of $k[X_1,\dots,X_{n-1}]$ into the algebra of functions from $k^{n-1}$ into $k$ is one-one,
and therefore there exist $c_1,\dots,c_{n-1}$ with $h_r(c_1,\dots,c_{n-1},1) \neq 0$. This proves (**).

We are left with proving that if $K$ is separably generated over $k$, then the $y_j$ may be chosen with
$K$ separable over $k(y_1,\dots,y_d)$. We proceed as above but with an amended version of (**) that we
mention in a moment. In the induction the extra hypothesis for $n = 1$ is that either $x_1$ is separable
algebraic over $k$ or $x_1$ is transcendental, and in both cases $K$ is a separable extension of $k(y_1)$.
For the inductive step when $J \neq 0$, Theorem 7.18 shows that $\{x_1,\dots,x_n\}$ contains a separating
transcendence basis; possibly by renumbering the variables, we may assume that this transcendence
basis is a subset of $\{x_1,\dots,x_{n-1}\}$. In particular, $x_n$ is separable algebraic over $k(x_1,\dots,x_{n-1})$. For
the polynomial $f$, we start from the minimal polynomial of $x_n$ over $k(x_1,\dots,x_{n-1})$, next multiply
by a common denominator to get all coefficients of powers of $X_n$ to be in $k[x_1,\dots,x_{n-1}]$, and then
replace the occurrences of $x_1,\dots,x_{n-1}$ by $X_1,\dots,X_{n-1}$. The result is $f$. We choose $y'_1,\dots,y'_d$
as above, and the inductive hypothesis shows that $k(x'_1,\dots,x'_{n-1})$ is separable over $k(y'_1,\dots,y'_d)$.
If we can show that $x_n$ is separable over $k(x'_1,\dots,x'_{n-1})$, then we will have proved that $K$ is a
separable extension of $k(y'_1,\dots,y'_d)$ because of the transitivity of separability. So the induction will
be complete.

To get that $x_n$ is separable over $k(x'_1,\dots,x'_{n-1})$, it is enough to prove that we can arrange for

$x_n$ to be a simple root of $f(x'_1 + c_1 X_n, \dots, x'_{n-1} + c_{n-1} X_n, X_n)$ \qquad (†)

in addition to (**). Indeed, then $x_n$ is a root of a separable polynomial over $k(x'_1,\dots,x'_{n-1})$ and
hence is a separable element over $k(x'_1,\dots,x'_{n-1})$. The condition (†) is the same as the condition
that the derivative of the polynomial in (†) with respect to $X_n$, when evaluated at $x_n$, be nonzero. Thus we want to
arrange that

$$f_n(x_1,\dots,x_{n-1},x_n) + c_1 f_1(x_1,\dots,x_{n-1},x_n) + \cdots + c_{n-1} f_{n-1}(x_1,\dots,x_{n-1},x_n) \neq 0, \qquad (††)$$

where the subscripts on $f$ indicate first partial derivatives in the indicated variables. The left side
of (††) is the sum of a constant and a linear functional on the vector space of all $(c_1,\dots,c_{n-1})$ in
$k^{n-1}$. The constant term is $f_n(x_1,\dots,x_{n-1},x_n)$, which is nonzero because $x_n$ is separable over
$k(x_1,\dots,x_{n-1})$ and is therefore a simple root of its minimal polynomial over $k(x_1,\dots,x_{n-1})$. Thus
the left side of (††) is the value of a nonzero polynomial $p(X_1,\dots,X_{n-1}) = a_n + \sum_{j=1}^{n-1} a_j X_j$
at $(c_1,\dots,c_{n-1})$. Consequently (**) and (††) will hold simultaneously if we choose a point
$(c_1,\dots,c_{n-1})$ in $k^{n-1}$ at which the nonzero polynomial $p(X_1,\dots,X_{n-1})\,h_r(X_1,\dots,X_{n-1},1)$ is
not zero.
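
EXAMPLE. As a small illustration of the change of variables in the proof, assume $n = 2$ and take
$R = k[x_1, x_2]$ with the single relation $x_1 x_2 = 1$; this particular ring is chosen here only for
illustration. Then $f = X_1 X_2 - 1$, $r = 2$, $h_2 = X_1 X_2$, and $h_2(c_1, 1) = c_1$, so that any $c_1 \neq 0$
makes (**) hold. With $c_1 = 1$ and $x'_1 = x_1 - x_2$, we get

$$f(X'_1 + X_2, X_2) = (X'_1 + X_2)X_2 - 1 = X_2^2 + X'_1 X_2 - 1,$$

which is monic in $X_2$. Thus $x_2$ is integral over $k[x'_1]$, and one may take $d = 1$ and $y_1 = x_1 - x_2$.
By contrast, the choice $c_1 = 0$ leaves $f = X_1 X_2 - 1$ unchanged, and its leading coefficient in $X_2$
is $X_1$ rather than a scalar; correspondingly $x_2 = x_1^{-1}$ is not integral over $k[x_1]$.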




PROOF OF THEOREM 10.57 UNDER THE ASSUMPTION THAT k IS ALGEBRAICALLY CLOSED.
The first step is to reduce to the case that $L = K$, i.e., that the field of fractions of $R$ coincides with
$L$. To do so, choose a vector-space basis $\{z_1,\dots,z_r\}$ of $L$ over $K$ consisting of elements integral
over $R$; this is possible by Proposition 8.42 of Basic Algebra. Put $S = R[z_1,\dots,z_r]$. This is a
finitely generated integral domain over $k$, all of its elements are integral over $R$, and it has $L$ as field
of fractions. The integral closure of $R$ in $L$ equals the integral closure of $S$ in $L$.

Thus we may assume that $R = k[x_1,\dots,x_n]$ is an integral domain with field of fractions $K$ and
that we are to prove that the integral closure $T$ of $R$ in $K$ is a finitely generated $R$ module. Let $d$ be
the transcendence degree of $K$ over $k$. Since algebraically closed fields are perfect, Theorem 7.20
shows that $K$ is separably generated over $k$. Lemma 10.62 is therefore applicable, and it produces
$d$ linear combinations $y_1,\dots,y_d$ of $x_1,\dots,x_n$ over $k$ such that the subring $S = k[y_1,\dots,y_d]$ of
$R$ is a full polynomial ring, every element of $R$ is integral over $S$, and $K$ is a separable extension
of the field $k(y_1,\dots,y_d)$. Since every element of $T$ is integral over $R$, the transitivity of integral
dependence implies that every element of $T$ is integral over $S$. Therefore $T$ is the integral closure
of $S$ in $K$. Being a full polynomial ring, $S$ is Noetherian and is a unique factorization domain; the
latter property implies that $S$ is integrally closed, according to Proposition 8.41 of Basic Algebra.
Taking $S$ to be the Noetherian integrally closed domain in Lemma 10.61, we see that $T$ is finitely
generated as an $S$ module. Since $S \subseteq R$, $T$ is certainly finitely generated as an $R$ module.


Now we come to the main theorem of this section. 


Theorem 10.63. Every birational equivalence class of irreducible projective 
curves contains a nonsingular such curve, and this curve is unique within the 
equivalence class up to isomorphism of varieties. Any irreducible nonsingular 
quasiprojective curve is isomorphic to an open subvariety of some irreducible 
nonsingular projective curve. 


REMARKS. The new content of the theorem is the existence of the nonsingular
projective curve. The uniqueness is immediate from Corollary 10.55. The
statement about nonsingular quasiprojective curves is a formality: Such a curve
$C_0$ is birational to the nonsingular projective curve $C$ produced by the theorem
and also to the projective closure $\overline{C_0}$ of $C_0$. The birational maps from $C_0$ into
$C$ and from $C$ into $\overline{C_0}$ yield morphisms from $C_0$ into $C$ and from $C$ into $\overline{C_0}$ by
Corollary 10.54; sorting out these morphisms shows that $C_0$ is isomorphic to an
open subvariety of $C$.


The idea for proving the existence of the projective curve in the theorem is to 
start with any function field K in one variable over k, take any discrete valuation 
v of K defined over k (these exist as a consequence of Section VI.2), and use 
Corollary 10.59 to obtain some irreducible nonsingular affine curve having K as 
function field and having its local ring at some point equal to the valuation ring of v. 
Corollary 10.60 shows that except for finitely many discrete valuations, we have 
associated a nonsingular point on some irreducible affine curve in the birational 
equivalence class to each discrete valuation of K defined over k. Applying 
Corollary 10.59 to each of these exceptional discrete valuations, we end up with a
finite set of irreducible nonsingular affine curves such that each discrete valuation
of K over k corresponds to some point of at least one of the curves. We shall
glue together these irreducible nonsingular affine curves in a suitable fashion to 
obtain the desired irreducible nonsingular projective curve. 

The proof makes use of the fact that the product of two projective varieties 
is a projective variety and that morphisms behave as one might expect. Let us 
postpone the details of establishing a rigorous theory of product varieties, going 
right to the proof of Theorem 10.63. 


PROOF OF THEOREM 10.63. Let $K$ be the given function field, and let $C_1,\dots,C_m$
be the irreducible nonsingular affine curves described two paragraphs before this
paragraph. In each case the function field of the curve is isomorphic to $K$ by
some fixed isomorphism, but we shall treat this fixed isomorphism as if it were
the identity in order to avoid unnecessary complications in the notation. Let $\mathcal{V}_K$
be the set of discrete valuations of $K$ defined over $k$. For $v \in \mathcal{V}_K$, we write
$R_v \subseteq K$ for the valuation ring of $v$ and $\mathfrak{p}_v$ for the valuation ideal of $v$.

For definiteness let $C_j$ be an affine variety in $\mathbb{A}^{k_j}$, and let $\overline{C}_1,\dots,\overline{C}_m$ be the
respective projective closures of $C_1,\dots,C_m$ in $\mathbb{P}^{k_j}$. For any point $P$ in $\overline{C}_j$, let
$M_P$ be the maximal ideal of the local ring $\mathcal{O}_P(\overline{C}_j)$.

Theorem 10.53 gives us for each $j$ a well-defined function $\psi_j : \mathcal{V}_K \to \overline{C}_j$, and
Corollary 10.58 says that $\psi_j$ is onto $\overline{C}_j$. The defining property of $\psi_j(v)$ is that
$M_{\psi_j(v)} \subseteq \mathfrak{p}_v$, and it follows that $\mathcal{O}_{\psi_j(v)}(\overline{C}_j) \subseteq R_v$. Corollary 10.51 shows that the
inverse image under $\psi_j$ of any point in $C_j$ is a singleton set, and Corollary 10.60
shows that the inverse image of any point of the complementary set $\overline{C}_j - C_j$
is a finite set. Let $F$ be the finite subset $F = \bigcup_{j=1}^m \psi_j^{-1}(\overline{C}_j - C_j)$ of $\mathcal{V}_K$.
For $v \notin F$, $\psi_j(v)$ is a nonsingular point of $C_j$, and Corollary 10.51 shows that
$\mathcal{O}_{\psi_j(v)}(C_j) = R_v$. Hence also $M_{\psi_j(v)} = \mathfrak{p}_v$. The construction of the curves
$C_1,\dots,C_m$ was arranged in such a way that

each $v \in \mathcal{V}_K$ has $\psi_j(v)$ in $C_j$ for some $j$. \qquad (*)


Let $U_j$ be the open set of $C_j$ given by $U_j = \psi_j(\mathcal{V}_K - F)$. The curves $C_j$ are
birationally equivalent because they all have $K$ as function field, and Corollary
10.54 shows that the largest domain on which the birational map from $C_j$ to $\overline{C}_1$
is a morphism includes all the nonsingular points of $C_j$. In particular, it contains
$U_j = \psi_j(\mathcal{V}_K - F)$. If $\varphi_j$ is the morphism from $U_j$ into $\overline{C}_1$, then Proposition
10.42 shows that $\varphi_j$ induces a homomorphism $\tilde\varphi_{j,P} : \mathcal{O}_{\varphi_j(P)}(\overline{C}_1) \to \mathcal{O}_P(C_j)$ for
$P \in U_j$. By assumption, the isomorphism $\tilde\varphi_j : k(C_1) \to k(C_j)$ is normalized to
be the identity. Since $\tilde\varphi_j$ is the field mapping corresponding to the birational map
$\varphi_j$, $\tilde\varphi_j$ is an extension of $\tilde\varphi_{j,P}$. Thus $\tilde\varphi_{j,P}$ is the identity under our identifications:
$\mathcal{O}_{\varphi_j(P)}(\overline{C}_1) = \mathcal{O}_P(C_j)$ for $P \in U_j$. Let $P = \psi_j(v)$ with $v$ in $\mathcal{V}_K - F$, and let
$\varphi_j(P) = \psi_1(v')$ with $v'$ in $\mathcal{V}_K$. Then $R_v = \mathcal{O}_{\psi_j(v)}(C_j) = \mathcal{O}_{\varphi_j(P)}(\overline{C}_1) \subseteq R_{v'}$, and
it follows that $v' = v$. In particular, $v'$ is in $\mathcal{V}_K - F$, and $\psi_1(v) = \varphi_j(\psi_j(v))$.
Hence

$\varphi_j \circ \psi_j : \mathcal{V}_K - F \to U_1$ is independent of $j$,

and $\varphi_j : U_j \to U_1$ is an isomorphism.


The product $W = \overline{C}_1 \times \cdots \times \overline{C}_m$ is an $m$-dimensional closed subvariety of
$\mathbb{P}^{k_1} \times \cdots \times \mathbb{P}^{k_m}$, which in turn is a projective variety in $\mathbb{P}^N$ for a suitably large $N$.
For $1 \leq j \leq m$, let $\pi_j : W \to \overline{C}_j$ be the $j$th projection map; this is a morphism.
The set $U_1 \times \cdots \times U_m$ is an open subvariety of $W$, and the "diagonal"

$$\Delta = \{\delta(P) = (P, \varphi_2^{-1}(P), \dots, \varphi_m^{-1}(P)) \mid P \in U_1\}$$

of $U_1 \times \cdots \times U_m$ is an irreducible curve isomorphic to $U_1$. The closure $C = \overline{\Delta}$
is an irreducible projective curve. It is a closed subvariety of $W$, and it has $\Delta$ as
an open subvariety. The curve $\Delta$ may be identified with $U_1$ via the projection $\pi_1$,
and we may therefore identify the function field of $\Delta$, which is the same as the
function field of $C$, with $K$.

We shall show that $C$ is nonsingular. For each $j$, the restriction $\pi_j : C \to \overline{C}_j$ is
a morphism, and the image contains all points $\pi_j(\delta(P)) = \varphi_j^{-1}(P)$ with $P \in U_1$.
Hence it contains $U_j$, which is an open subset of $\overline{C}_j$. In other words, $\pi_j : C \to \overline{C}_j$
is a dominant morphism. For $P \in U_1$, we have $\pi_j(\delta(P)) = \varphi_j^{-1}(P)$. If
$Q = \delta(P)$, this says that $\pi_j(Q) = \varphi_j^{-1}(\pi_1(Q))$, from which it follows that
$\delta \circ \varphi_j$ is a two-sided inverse of $\pi_j$ on $\Delta$. Consequently the dominant morphism
$\pi_j : C \to \overline{C}_j$ is a birational map. Let $(V_j, \eta_j)$ be a pair in the class of the rational
map $\pi_j^{-1}$; we may assume that $V_j$ is the largest domain in $\overline{C}_j$ on which $\pi_j^{-1}$ is a
morphism.

Let $P$ be any point of $C$, and let $M_P$ be the maximal ideal of $\mathcal{O}_P(C)$. Corollary
10.58 shows that there is a member $v$ of $\mathcal{V}_K$ such that $M_P \subseteq \mathfrak{p}_v$. Choose $j = j(P)$
with $1 \leq j \leq m$ such that $\psi_j(v)$ is in $C_j$. Since every point of $C_j$ is a nonsingular
point by construction, Corollary 10.54 shows that every point of $C_j$ lies in the
domain $V_j$ on which $\eta_j$ is defined as a morphism inverting $\pi_j$. Consequently the
open subvariety $\pi_j^{-1}(C_j)$ of $C$ is isomorphic to the nonsingular irreducible affine
curve $C_j$, and the point $P$ of $C$ has an open neighborhood of nonsingular points.
Since $P$ is arbitrary, $C$ is nonsingular.


The remainder of this section develops a small theory of products of varieties 
in projective spaces. Most of the proofs are left to the problems at the end of the 
chapter. It is enough to handle the product of two varieties because general finite 
products of varieties can then be treated by induction. 

We begin with the product of two projective spaces. Let $m \geq 1$ and $n \geq 1$ be
integers, and put $N = (m+1)(n+1) - 1 = mn + m + n$. We shall exhibit
$\mathbb{P}^m \times \mathbb{P}^n$ as a projective variety in $\mathbb{P}^N$. To do so, we coordinatize $\mathbb{P}^m$, $\mathbb{P}^n$, and $\mathbb{P}^N$
by using $x_i$, $y_j$, and $w_{ij}$ for $0 \leq i \leq m$ and $0 \leq j \leq n$. Then

$$\mathbb{P}^m = \{[x_0,\dots,x_m]\}, \qquad \mathbb{P}^n = \{[y_0,\dots,y_n]\},$$

and

$$\mathbb{P}^N = \{[w_{00}, w_{01}, \dots, w_{m,n-1}, w_{mn}]\}.$$

The Segre embedding is the function

$$\sigma([x_0,\dots,x_m],[y_0,\dots,y_n]) = [x_0y_0, x_0y_1, \dots, x_my_{n-1}, x_my_n],$$

i.e., $w_{ij} = x_iy_j$. Define $\mathfrak{a} \subseteq k[W_{00},\dots,W_{mn}]$ to be the homogeneous ideal
generated by all $W_{ij}W_{kl} - W_{il}W_{kj}$. Problems 17–19 at the end of the chapter
show that $\sigma$ is well defined and one-one, that the image of $\sigma$ is $V(\mathfrak{a})$, and that
$V(\mathfrak{a})$ is irreducible. Thus the Segre embedding exhibits $\mathbb{P}^m \times \mathbb{P}^n$ as a projective
variety in $\mathbb{P}^N$. This variety is known as a Segre variety.¹⁸
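
The formula $w_{ij} = x_iy_j$ can be checked numerically. The following is a minimal sketch, not part of the theory itself: the function names `segre` and `minors_2x2_vanish` are hypothetical, and integer coordinates are used only so that the arithmetic is exact. It arranges the coordinates $w_{ij}$ as an $(m+1)$-by-$(n+1)$ matrix and tests the generators $W_{ij}W_{kl} - W_{il}W_{kj}$ of the ideal above.

```python
from itertools import combinations

def segre(x, y):
    """Return the matrix (w_ij) = (x_i * y_j) of homogeneous coordinates
    of the image of ([x], [y]) under the Segre embedding."""
    return [[xi * yj for yj in y] for xi in x]

def minors_2x2_vanish(w):
    """Check that w_ij * w_kl - w_il * w_kj = 0 for all i < k and j < l.
    The remaining generators are zero or negatives of these."""
    rows, cols = len(w), len(w[0])
    return all(
        w[i][j] * w[k][l] - w[i][l] * w[k][j] == 0
        for i, k in combinations(range(rows), 2)
        for j, l in combinations(range(cols), 2)
    )

# A point of P^2 x P^1 mapped into P^5 (m = 2, n = 1, N = 5).
w = segre([1, 2, 3], [4, 5])
print(minors_2x2_vanish(w))                 # True: the image lies on the Segre variety
print(minors_2x2_vanish([[1, 0], [0, 1]]))  # False: this point of P^3 is not in the image
```

The second call shows a point of $\mathbb{P}^3$ off the Segre variety: the identity matrix has rank 2, so it is not of the form $(x_iy_j)$.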

Let $U \subseteq \mathbb{P}^m$ and $V \subseteq \mathbb{P}^n$ be projective algebraic sets. Then the Segre
embedding $\sigma$ carries $U \times V$ to a subset of $\mathbb{P}^N$, and we wish to see that $\sigma(U \times V)$ is
a projective algebraic set in $\mathbb{P}^N$. Let us use the abbreviation $X = (X_0,\dots,X_m)$.
If $\alpha = (\alpha_0,\dots,\alpha_m)$ is an $(m+1)$-tuple of nonnegative integers, we define
$|\alpha| = \alpha_0 + \cdots + \alpha_m$ and $X^\alpha = X_0^{\alpha_0} \cdots X_m^{\alpha_m}$. We define $Y$, $\beta$, $|\beta|$, and $Y^\beta$ similarly.
Any monomial $X^\alpha Y^\beta$ with $|\alpha| = d$ and $|\beta| = e$ is said to be bihomogeneous of
bidegree $(d, e)$. A bihomogeneous polynomial of bidegree $(d, e)$ is any linear
combination of bihomogeneous monomials of bidegree $(d, e)$.

The first observation is that any projective algebraic set $S$ in $\mathbb{P}^m$ can be described
as the locus of common zeros of a vector space of homogeneous polynomials in
$X$ of a fixed degree. In fact, we know that $S$ is given by the locus of common
zeros of a finite set of homogeneous polynomials $F_1(X),\dots,F_r(X)$ of various
degrees $d_1,\dots,d_r$. Let us say that $d = \max_j d_j$. The point is that $S$ is given
by the locus of common zeros of a finite set of homogeneous polynomials all of
degree $d$. The reason is that the locus of common zeros of $F_j(X)$ is the same
as the locus of common zeros of $X_0^{d-d_j}F_j(X),\dots,X_m^{d-d_j}F_j(X)$. The assertion
about describing $S$ follows.

Now let $U \subseteq \mathbb{P}^m$ be the locus of common zeros of homogeneous polynomials
$F_1(X),\dots,F_r(X)$ all of degree $d$, and let $V \subseteq \mathbb{P}^n$ be the locus of common zeros
of homogeneous polynomials $G_1(Y),\dots,G_s(Y)$ all of degree $e$. Then $U \times V$
is the locus of common zeros of the bihomogeneous polynomials $F_i(X)G_j(Y)$,
all of bidegree $(d, e)$. These cannot immediately be expressed in terms of the
polynomials $W_{ij}$ of the Segre embedding. However, if we use the same trick
again, we can substitute the $W_{ij}$'s. Specifically suppose that $d \leq e$. Replace
$F_1(X),\dots,F_r(X)$ by a family of $r(m+1)$ polynomials $F'_1(X),\dots,F'_{r(m+1)}(X)$
homogeneous of degree $e$. Then the polynomials $F'_i(X)G_j(Y)$ are bihomogeneous
of bidegree $(e, e)$. When such a polynomial is expanded as a linear
combination of monomials, each monomial has $e$ factors from among $X_0,\dots,X_m$
and $e$ factors from among $Y_0,\dots,Y_n$. We can pair the factors in whatever fashion
we want and replace $X_iY_j$ by $W_{ij}$. In this way our system of bihomogeneous
polynomials can be rewritten as a system of polynomials $H_{ij}(W)$, together with
the convention that $W_{ij} = X_iY_j$. Then $\sigma(U \times V)$ is the locus of common zeros in
$\mathbb{P}^N$ of the polynomials $H_{ij}(W)$ and the defining polynomials of the Segre variety.

¹⁸ If we form the $(m+1)$-by-$(n+1)$ matrix whose $(i, j)$th entry is $W_{ij}$, then an equivalent
description of the Segre variety is as the locus of common zeros of all 2-by-2 minors of this matrix.

Conversely if we have a projective algebraic set in $\mathbb{P}^N$, then its intersection
with the Segre variety can be described as the locus of common zeros in $\mathbb{P}^m \times \mathbb{P}^n$
of a family of bihomogeneous polynomials in $(X, Y)$. We have only to take the
defining homogeneous polynomials $H(W)$ and substitute the definition $W_{ij} = X_iY_j$
for $W_{ij}$. If $H(W)$ is homogeneous of degree $e$, then the result of the
substitution is a polynomial bihomogeneous of bidegree $(e, e)$.

Problems 20–21 at the end of the chapter show that if $U$ and $V$ are irreducible
closed sets in $\mathbb{P}^m$ and $\mathbb{P}^n$, respectively, then $\sigma(U \times V)$ is irreducible in $\mathbb{P}^N$. Thus
we can meaningfully speak of projective varieties in $\mathbb{P}^m \times \mathbb{P}^n$. The same pair of
problems addresses what happens for quasiprojective varieties, showing that $\sigma$ of
any relatively open subset of a projective variety in $\mathbb{P}^m \times \mathbb{P}^n$ is a quasiprojective
variety in $\mathbb{P}^N$.

Now that the notion of variety is meaningful in $\mathbb{P}^m \times \mathbb{P}^n$, with an interpretation
in $\mathbb{P}^N$, we can similarly translate definitions and facts about morphisms to make
them apply in $\mathbb{P}^m \times \mathbb{P}^n$. In particular, the projection of a variety to either factor
$\mathbb{P}^m$ or $\mathbb{P}^n$ is a morphism on the variety. If $U$ is a quasiprojective variety and if
$\varphi_1 : U \to \mathbb{P}^m$ and $\varphi_2 : U \to \mathbb{P}^n$ are isomorphisms of $U$ onto quasiprojective
varieties in $\mathbb{P}^m$ and $\mathbb{P}^n$, then the diagonal $\Delta = \{(\varphi_1(u), \varphi_2(u)) \mid u \in U\}$ is a
quasiprojective variety in $\mathbb{P}^m \times \mathbb{P}^n$, and the pair $(\varphi_1, \varphi_2)$ is an isomorphism of
varieties. These matters are discussed in Problem 22 at the end of the chapter.


9. Affine Algebraic Sets for Monomial Ideals 


Sections 9–12 in part address aspects of the question of how much one can
make explicit computations with affine and projective varieties. As a general
rule, the tool for such computations is the theory of Gröbner bases, which were
introduced in Sections VIII.7–VIII.10. The topic is an active area of continuing
research.¹⁹ One can think of immediate problems, such as finding the dimension
of an algebraic set, determining the radical of an ideal when the ideal is given,
and deciding whether an ideal is prime. We shall concentrate on just one such
problem, that of finding the dimension.²⁰

¹⁹ The book edited by Buchberger and Winkler contains a number of expository "tutorials" that
give an idea of the breadth of applications of the theory. The book contains also a certain number of
research papers.

Part of the abstract theory in this case dates back to Hilbert, but in combination
with the theory of Gröbner bases it becomes easier to establish and relatively easy
to implement computationally.²¹ We shall prove in Section 12 as a consequence
of this investigation the deep theorem that a system of simultaneous homogeneous
polynomial equations having more variables than equations always has a nonzero
solution.²²

Hilbert associated a polynomial in one variable, now known as the "Hilbert
polynomial," to each ideal of polynomials over an algebraically closed field.
This polynomial encodes certain algebraic information about the ideal, and some
features of this polynomial depend only on the geometry of the zero locus. In
particular, the degree of the polynomial turns out to equal the geometric dimension
of the zero locus, and that will be what interests us.

The theory behind Gröbner bases enables one to reduce the theory of the
Hilbert polynomial to the case of a monomial ideal, for which it is relatively easy
to understand.²³ We begin with that case in this section.

Let $k$ be an algebraically closed field, consider affine space $\mathbb{A}^n$, and let $\mathfrak{a}$ be
an ideal in $A = k[X_1,\dots,X_n]$. In this section we shall be interested in the case
that $\mathfrak{a}$ is generated by monomials, in which case it is called a monomial ideal.
The structure of monomial ideals is captured by Lemma 8.17, which says about
such an ideal $\mathfrak{a}$ that

• for any polynomial $f \neq 0$ in $\mathfrak{a}$, each monomial term contributing to $f$
lies in $\mathfrak{a}$,

• $\mathfrak{a}$ has a finite set of monomials as generators,

• if $\{M_1,\dots,M_l\}$ is a set of monomials that generate $\mathfrak{a}$ and if $M$ is any
monomial in $\mathfrak{a}$, then some $M_j$ divides $M$.

Let $e_1,\dots,e_n$ be the standard basis of $\mathbb{A}^n$, and let $\langle e_{j_1},\dots,e_{j_k}\rangle$ be the linear
span of $e_{j_1},\dots,e_{j_k}$. The vector space $\langle e_{j_1},\dots,e_{j_k}\rangle$ is called a coordinate
subspace of $\mathbb{A}^n$. The ideal $\mathfrak{p}_k = (X_1,\dots,X_k)$ in $A$ is prime, and its variety
is $V(\mathfrak{p}_k) = \langle e_{k+1},\dots,e_n\rangle$. Since $\mathfrak{p}_0 \subset \mathfrak{p}_1 \subset \cdots \subset \mathfrak{p}_n$ is a strictly
increasing sequence of prime ideals in $A$ and since $A$ has Krull dimension $n$,
no strictly increasing sequence of prime ideals containing $\mathfrak{p}_k$ can be longer than
$\mathfrak{p}_k \subset \mathfrak{p}_{k+1} \subset \cdots \subset \mathfrak{p}_n$. It follows that the images of these ideals in $A/\mathfrak{p}_k$
give a strictly increasing sequence of prime ideals of maximal length and that
$A/\mathfrak{p}_k$ has Krull dimension $n - k$. By Theorem 10.7 the geometric dimension of
$V(\mathfrak{p}_k) = \langle e_{k+1},\dots,e_n\rangle$ is $n - k$. In other words, the geometric dimension of
the vector subspace $\langle e_{k+1},\dots,e_n\rangle$ is the same as the vector-space dimension.
Relabeling indices in this computation, we see that the geometric dimension of
$\langle e_{j_1},\dots,e_{j_k}\rangle$ is $k$ if the indices $j_1,\dots,j_k$ are distinct.

²⁰ Solutions to the other two problems are known as well. References may be found in Cox–
Little–O'Shea. For determining the radical, see p. 177. For deciding whether an ideal is prime, see
p. 207.

²¹ The exposition in Sections 9–12 is based in part on Chapter 9 of the book by Cox–Little–O'Shea
and in part on Chapter I of Hartshorne's book.

²² For one equation with two variables, this amounts to the Fundamental Theorem of Algebra.
For two equations with three variables, it amounts to the existence part of Bezout's Theorem as
formulated in Theorem 8.5.

²³ Similarly the computations associated with Gröbner bases make it possible to reduce the
computation of the Hilbert polynomial of a general ideal to the computation of the Hilbert polynomial
of a monomial ideal.

Let us compute the geometric dimension of the zero locus of a general proper
monomial ideal $(M_1,\dots,M_k)$. If $\alpha = (\alpha_1,\dots,\alpha_n)$ is a tuple of integers $\geq 0$,
we write $X^\alpha$ for $X_1^{\alpha_1} \cdots X_n^{\alpha_n}$ and $|\alpha|$ for $\alpha_1 + \cdots + \alpha_n$. Let $H_j = V(X_j)$ be the
coordinate hyperplane of points in $\mathbb{A}^n$ with $j$th coordinate 0. This is the linear
span of all $e_i$ for $i \neq j$, and it has geometric dimension $n - 1$. If a monomial $X^\alpha$
is given, then Proposition 10.1 shows that

$$V(X^\alpha) = \bigcup_{\alpha_j > 0} V(X_j) = \bigcup_{\alpha_j > 0} H_j$$

and then that

$$V(X^\alpha, X^\beta) = \Big(\bigcup_{\alpha_i > 0} H_i\Big) \cap \Big(\bigcup_{\beta_j > 0} H_j\Big) = \bigcup_{\alpha_i > 0,\ \beta_j > 0} (H_i \cap H_j).$$

Similarly $V(M_1,\dots,M_k)$ is a finite union of $k$-fold intersections of coordinate
hyperplanes. By Theorem 10.7 the geometric dimension of $V(M_1,\dots,M_k)$ is the
maximum dimension of the subspaces $H_i \cap H_j \cap \cdots$ appearing in the appropriate
union for $M_1,\dots,M_k$. To get the maximum dimension, we want as few distinct
indices as possible to appear in an intersection $H_i \cap H_j \cap \cdots$. If the smallest possible number
of distinct indices is $m$, then we see that $V(M_1,\dots,M_k)$ has geometric dimension
$n - m$.
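
The recipe just described amounts to finding the smallest set of indices that meets the support of every generator. The following is a minimal brute-force sketch under stated assumptions: monomials are encoded as exponent tuples of length $n$, indices are 0-based, and the names `support` and `monomial_variety_dimension` are hypothetical. The search is exponential in $n$ and is meant only to illustrate the computation, not to be an efficient algorithm.

```python
from itertools import combinations

def support(alpha):
    """Indices j with alpha_j > 0, i.e., the hyperplanes H_j making up V(X^alpha)."""
    return frozenset(j for j, a in enumerate(alpha) if a > 0)

def monomial_variety_dimension(generators, n):
    """Geometric dimension of V(M_1, ..., M_k) in affine n-space, where each
    generator is an exponent tuple of length n.  Returns n - m, with m the
    smallest number of distinct indices meeting every generator's support."""
    supports = [support(g) for g in generators]
    if any(not s for s in supports):
        raise ValueError("the constant monomial 1 is a generator; the ideal is not proper")
    for m in range(n + 1):
        for chosen in combinations(range(n), m):
            if all(s & set(chosen) for s in supports):
                return n - m
    raise AssertionError("unreachable for a proper monomial ideal")

# (X1^2 X2, X2 X3) in k[X1, X2, X3]: the single index of X2 meets both supports.
print(monomial_variety_dimension([(2, 1, 0), (0, 1, 1)], 3))  # 2
# (X1, X2 X3): two indices are needed, so the zero locus is a union of lines.
print(monomial_variety_dimension([(1, 0, 0), (0, 1, 1)], 3))  # 1
```

With an empty list of generators the ideal is 0 and the function returns $n$, in agreement with $V(0) = \mathbb{A}^n$.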

The insight is that to study $V(\mathfrak{a})$, one studies $A/\mathfrak{a}$, and that to study the latter,
one considers what happens as a function of $s$ to the part of $A/\mathfrak{a}$ that corresponds
to degree at most $s$. In the case of a monomial ideal, this means that one is to study
the monomials outside the ideal in question, particularly how the number of these
monomials grows with $s$. Let $\mathcal{M}$ be the set of all monomials in $k[X_1,\dots,X_n]$.
For our monomial ideal $\mathfrak{a}$, let $C(\mathfrak{a})$ be the complementary subset to $\mathfrak{a}$ in $\mathcal{M}$ given
by

$$C(\mathfrak{a}) = \{X^\alpha \mid X^\alpha \notin \mathfrak{a}\}.$$


Proposition 10.64. If $\mathfrak{a}$ is a proper monomial ideal in $k[X_1,\dots,X_n]$, then
(a) the vector subspace $V(\{X_i \mid i \notin \{j_1,\dots,j_k\}\})$ is contained in $V(\mathfrak{a})$ if
and only if $\{X^\alpha \in \mathcal{M} \mid \alpha \in \langle e_{j_1},\dots,e_{j_k}\rangle\}$ is contained in $C(\mathfrak{a})$,
(b) the geometric dimension of $V(\mathfrak{a})$ equals the largest vector-space dimension
of a coordinate subspace that lies in $C(\mathfrak{a})$.




REMARK. The hypothesis “proper” is needed for (b), not for (a). 


PROOF. For (a), first suppose that $V(\{X_i \mid i \notin \{j_1,\dots,j_k\}\})$ is contained in
$V(\mathfrak{a})$, and suppose that $\alpha$ is in $\langle e_{j_1},\dots,e_{j_k}\rangle$. Let $P = (x_1,\dots,x_n)$ be the point
with

$$x_i = \begin{cases} 1 & \text{for } i \in \{j_1,\dots,j_k\}, \\ 0 & \text{for } i \notin \{j_1,\dots,j_k\}. \end{cases} \qquad (*)$$

Then $P$ is on the zero locus of each $X_i$ for $i \notin \{j_1,\dots,j_k\}$, and hence $P$ is in
$V(\mathfrak{a})$. On the other hand, the value of the monomial $X^\alpha$ at $P$ is 1. Since the value
of every member of $\mathfrak{a}$ at $P$ is 0, $X^\alpha$ cannot be in $\mathfrak{a}$. Thus $X^\alpha$ is in $C(\mathfrak{a})$.

Next suppose that $E = V(\{X_i \mid i \notin \{j_1,\dots,j_k\}\})$ is not contained in $V(\mathfrak{a})$.
Say that $P = (x_1,\dots,x_n)$ is in $E$ but not $V(\mathfrak{a})$. The condition for $P$ to be in $E$
is that $x_i = 0$ for all $i \notin \{j_1,\dots,j_k\}$. Since $P$ is not in $V(\mathfrak{a})$, some member of $\mathfrak{a}$
is nonzero at $P$. The ideal is generated by monomials, and thus some monomial
$X^{\alpha_0}$ in $\mathfrak{a}$ is nonzero at $P$. Let $\alpha_0 = (\alpha_1,\dots,\alpha_n)$. The (nonzero) value of $X^{\alpha_0}$ at
$P$ is $\prod_{i \text{ with } \alpha_i > 0} x_i^{\alpha_i}$. Now $x_i = 0$ for all $i \notin \{j_1,\dots,j_k\}$, and consequently no $i$
outside $\{j_1,\dots,j_k\}$ can have $\alpha_i > 0$. Thus $\alpha_0$ is in $\langle e_{j_1},\dots,e_{j_k}\rangle$, and $X^{\alpha_0}$ exhibits
$\{X^\alpha \in \mathcal{M} \mid \alpha \in \langle e_{j_1},\dots,e_{j_k}\rangle\}$ as failing to be contained in $C(\mathfrak{a})$.

For (b), we saw before the proof that $V(\mathfrak{a})$ is the union of finitely many vector
subspaces and that each vector subspace is an affine variety whose geometric
dimension equals its vector-space dimension. By Theorem 10.7 the geometric
dimension of $V(\mathfrak{a})$, $\mathfrak{a}$ being proper, is the maximum of the dimensions of these
subspaces. Taking (a) into account, we conclude that (b) holds.
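
Part (b) can be turned into a direct computation. The sketch below is an illustration under stated assumptions rather than a definitive implementation: monomial generators are exponent tuples, indices are 0-based, and the function names are hypothetical. It uses the divisibility property from Lemma 8.17: the monomials with exponents in $\langle e_{j_1},\dots,e_{j_k}\rangle$ all lie in $C(\mathfrak{a})$ exactly when no generator has its support inside $\{j_1,\dots,j_k\}$.

```python
from itertools import combinations

def support(alpha):
    """Set of indices i with alpha_i > 0."""
    return frozenset(i for i, a in enumerate(alpha) if a > 0)

def coordinate_subspace_in_complement(generators, indices):
    """True if every monomial X^alpha with alpha in <e_j : j in indices>
    lies in C(a), i.e., no generator is supported inside `indices`."""
    allowed = frozenset(indices)
    return not any(support(g) <= allowed for g in generators)

def largest_coordinate_subspace_dim(generators, n):
    """Largest k such that some k-dimensional coordinate subspace lies in C(a);
    by Proposition 10.64b this is the geometric dimension of V(a).
    The ideal is assumed proper, so the empty index set always qualifies."""
    for k in range(n, -1, -1):
        for indices in combinations(range(n), k):
            if coordinate_subspace_in_complement(generators, indices):
                return k
    raise ValueError("ideal is not proper")

# (X1^2 X2, X2 X3): the subspace <e1, e3> consists of exponents avoiding X2.
print(largest_coordinate_subspace_dim([(2, 1, 0), (0, 1, 1)], 3))  # 2
```

For these generators the answer agrees with the count of hyperplane indices made before the proposition: one index is needed there, and $3 - 1 = 2$.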


We seek a formula for the number of monomials in $C(\mathfrak{a})$ of total degree $\leq s$
when $s$ is large and positive. We begin with a lemma. For a monomial ideal
$\mathfrak{a}$, the function carrying each integer $s \geq 0$ to the number of $X^\alpha$ in $C(\mathfrak{a})$ with
$|\alpha| \leq s$ is called the affine Hilbert function of $\mathfrak{a}$ and is denoted by $H_a(s, \mathfrak{a})$.
For $\mathfrak{a} = k[X_1,\dots,X_n]$, the affine Hilbert function is identically 0, and we shall
usually not be interested in this case.


EXAMPLE. For $n = 1$ with one indeterminate $X$, the proper ideals of $k[X]$ are 0
and $(X^k)$ with $k > 0$. The monomials $X^\alpha$ with $|\alpha| \leq s$ are $1, X, X^2, \dots, X^s$. By
inspection, none of these is in $\mathfrak{a}$ if $\mathfrak{a} = 0$, and thus $H_a(s, 0) = s + 1$. In the case
of $(X^k)$ with $k > 0$, the monomials $X^\alpha$ in $C((X^k))$ are $1, X, \dots, X^{k-1}$, and thus
$H_a(s, (X^k))$ is $s + 1$ for $s \leq k - 1$ and is $k$ for $s > k - 1$.
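
The affine Hilbert function of a monomial ideal can be computed by direct enumeration. The following is a minimal sketch, assuming that monomials are encoded as exponent tuples and that membership in the ideal is tested by divisibility by a generator, as in Lemma 8.17; the function names are hypothetical, and the enumeration is practical only for small $n$ and $s$.

```python
from itertools import product

def in_monomial_ideal(alpha, generators):
    """X^alpha lies in the monomial ideal iff some generator divides it."""
    return any(all(a >= g for a, g in zip(alpha, gen)) for gen in generators)

def affine_hilbert_function(s, generators, n):
    """Number of monomials X^alpha in n indeterminates with |alpha| <= s
    that lie outside the ideal generated by `generators`."""
    return sum(
        1
        for alpha in product(range(s + 1), repeat=n)
        if sum(alpha) <= s and not in_monomial_ideal(alpha, generators)
    )

# n = 1 and the ideal (X^3): the values are s + 1 for s <= 2 and 3 afterward.
print([affine_hilbert_function(s, [(3,)], 1) for s in range(6)])  # [1, 2, 3, 3, 3, 3]
# n = 2 and the ideal (X1 X2): only pure powers survive, so the count is 2s + 1.
print([affine_hilbert_function(s, [(1, 1)], 2) for s in range(4)])  # [1, 3, 5, 7]
```

In the second case the count grows linearly in $s$, and $V(X_1X_2)$, the union of the two coordinate axes, has geometric dimension 1.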


Theorem 10.65. If $\mathfrak{a}$ is a proper monomial ideal in $k[X_1,\dots,X_n]$, then the
complementary set $C(\mathfrak{a})$ of monomials is a disjoint union
$$C(\mathfrak{a}) = C_0 \cup \cdots \cup C_n,$$
where $C_k$ is a finite union of subsets of the form

$$E = \Big\{X^\alpha \in \mathcal{M} \;\Big|\; \alpha \in \langle e_{j_1},\dots,e_{j_k}\rangle + \sum_{i \notin \{j_1,\dots,j_k\}} a_i e_i\Big\}.$$

Here it is assumed that $\langle e_{j_1},\dots,e_{j_k}\rangle$ is a $k$-dimensional coordinate subspace and
the coefficients $a_i$ are particular integers $\geq 0$.


REMARKS. The subsets of $\mathcal{M}$ of which the above set $E$ is an example will be
called standard subsets of $\mathcal{M}$ with $k$ parameters. The member $\sum_{i \notin \{j_1,\dots,j_k\}} a_i e_i$
of $\mathcal{M}$ is called the associated translation of $E$, and $\langle e_{j_1},\dots,e_{j_k}\rangle$ is called the
associated vector subspace of $E$. Standard subsets of $\mathcal{M}$ with 0 parameters are
singleton sets $\{X^\alpha\}$. An example of a standard subset of $\mathcal{M}$ with 1 parameter
when $n = 2$ is $\{X_1^{\alpha_1}X_2^{\alpha_2} \mid \alpha_1 \geq 0,\ \alpha_2 = 2\} = \{X^\alpha \mid \alpha \in \langle e_1\rangle + 2e_2\}$. It
is apparent that the one and only circumstance in which $C_n$ is nonempty is that
$C(\mathfrak{a}) = \mathcal{M}$, in which case $\mathfrak{a} = 0$.


PROOF. We proceed by induction on n, and we may assume that a ~ 0. The 
example above shows for n = | that C(a) is a finite set if a is a nonzero proper 
ideal. Thus C(a) = Co in this case, and the base case of the induction is settled. 

Assume inductively that the theorem has been proved for $n - 1$ indeterminates,
and let $\mathfrak a$ be a nonzero ideal in $k[X_1, \dots, X_n]$. Let $\mathcal{M}_{n-1}$ and $\mathcal{M}_n$ denote the
sets of monomials in $X_1, \dots, X_{n-1}$ and $X_1, \dots, X_n$, respectively. For $j \ge 0$, let
$\mathfrak a_j$ be the ideal in $k[X_1, \dots, X_{n-1}]$ of all polynomials $f(X_1, \dots, X_{n-1})$ such that
$X_n^j f(X_1, \dots, X_{n-1})$ is in $\mathfrak a$. The ideals $\mathfrak a_j$ are monomial ideals because $\mathfrak a$ is a
monomial ideal, and $\mathfrak a_j \subseteq \mathfrak a_{j+1}$ for all $j$. Since $k[X_1, \dots, X_{n-1}]$ is Noetherian,
there is some index $l$ such that $\mathfrak a_j = \mathfrak a_l$ for all $j \ge l$. We apply the inductive
hypothesis to $\mathfrak a_0, \mathfrak a_1, \dots, \mathfrak a_l$, writing

$$C(\mathfrak a_j) = C_{0,j} \cup \cdots \cup C_{n-1,j} \qquad \text{for } 0 \le j \le l.$$

Here each $C_{k,j}$ is a finite union of standard subsets with $k$ parameters in the $n - 1$
indeterminates $X_1, \dots, X_{n-1}$.
Let $C_{k,j} X_n^j$ be the set of all products of members of $C_{k,j}$ with $X_n^j$. We shall
show that
$$C(\mathfrak a) = C_0 \cup \cdots \cup C_n, \tag{$*$}$$


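where $C_0, \dots, C_n$ are defined by

$$C_{k+1} = \Big( \bigcup_{j=0}^{\infty} C_{k,l} X_n^j \Big) \cup \Big( \bigcup_{j=0}^{l-1} C_{k+1,j} X_n^j \Big) \qquad \text{for } 0 \le k \le n - 1$$

and
$$C_0 = C(\mathfrak a) - \bigcup_{k=1}^{n} C_k.$$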


But first let us see that each $C_{k+1}$ for $0 \le k \le n - 1$ is a finite union of standard
subsets of $\mathcal{M}_n$ with $k + 1$ parameters. Each $C_{k+1,j}$ is a finite union of standard
subsets of $\mathcal{M}_{n-1}$ with some associated translation $\gamma$ such that $\gamma_n = 0$ and with
an associated vector subspace $\langle e_{j_1}, \dots, e_{j_{k+1}} \rangle$ such that $j_1 < \cdots < j_{k+1} < n$.
Then each $C_{k+1,j} X_n^j$ is a finite union of standard subsets of $\mathcal{M}_n$
with associated translation $\gamma + j e_n$ and with the same associated vector space
$\langle e_{j_1}, \dots, e_{j_{k+1}} \rangle$. Similarly the set $\bigcup_{j \ge 0} C_{k,l} X_n^j$ is a finite union of standard subsets
of $\mathcal{M}_n$ with associated translation $\gamma + 0 e_n$ and with associated vector space of
the form $\langle e_{j_1}, \dots, e_{j_k}, e_n \rangle$. Thus $C_{k+1}$ is a finite union of standard subsets of $\mathcal{M}_n$
with $k + 1$ parameters.

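Let us verify $(*)$. The most general monomial in $k[X_1, \dots, X_n]$ is $X^\beta X_n^j$ with
$X^\beta$ in $k[X_1, \dots, X_{n-1}]$, and this monomial is in $\mathfrak a$ if and only if $X^\beta$ is in $\mathfrak a_j$.
Hence $X^\beta X_n^j$ is in $C(\mathfrak a)$ if and only if $X^\beta$ is in $C(\mathfrak a_j)$. Since $\mathfrak a_j = \mathfrak a_l$ for $j \ge l$,
$C(\mathfrak a_j) = C(\mathfrak a_l)$ for $j \ge l$. Thus

$$C(\mathfrak a) = \Big( \bigcup_{j=l}^{\infty} C(\mathfrak a_l) X_n^j \Big) \cup \Big( \bigcup_{j=0}^{l-1} C(\mathfrak a_j) X_n^j \Big). \tag{$**$}$$

If $j < l$, then $X^\beta X_n^l \in C(\mathfrak a)$ implies $X^\beta X_n^j \in C(\mathfrak a)$, since $X_n^{l-j} \mathfrak a \subseteq \mathfrak a$. Therefore
$C(\mathfrak a_j) \supseteq C(\mathfrak a_l)$ for all $j < l$, and we see that $j < l$ implies that $C(\mathfrak a_j) =
C(\mathfrak a_j) \cup C(\mathfrak a_l)$. Substituting into $(**)$ and rearranging terms gives

$$C(\mathfrak a) = \Big( \bigcup_{j=0}^{\infty} C(\mathfrak a_l) X_n^j \Big) \cup \Big( \bigcup_{j=0}^{l-1} C(\mathfrak a_j) X_n^j \Big). \tag{$\dagger$}$$

For $j \le l$, $X^\beta$ is in $C(\mathfrak a_j)$ if and only if $X^\beta$ is in one of $C_{0,j}, \dots, C_{n-1,j}$. Thus
we can rewrite $(\dagger)$ as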


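$$\begin{aligned}
C(\mathfrak a) &= \Big( \bigcup_{j=0}^{\infty} \bigcup_{k=0}^{n-1} C_{k,l} X_n^j \Big) \cup \Big( \bigcup_{j=0}^{l-1} \bigcup_{k=0}^{n-1} C_{k,j} X_n^j \Big) \\
&= \Big( \bigcup_{j=0}^{\infty} \bigcup_{k=0}^{n-1} C_{k,l} X_n^j \Big) \cup \Big( \bigcup_{j=0}^{l-1} \bigcup_{k=0}^{n-2} C_{k+1,j} X_n^j \Big) \cup \Big( \bigcup_{j=0}^{l-1} C_{0,j} X_n^j \Big).
\end{aligned}$$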


The first term on the right side contributes to $C_{k+1}$, with $e_n$ to be adjoined to the
basis vectors of the associated vector subspace $\langle e_{j_1}, \dots, e_{j_k} \rangle$. Equating the terms
on the two sides that contribute to $C_{k+1}$ therefore yields $(*)$. The set $C_0$ is the
last term on the right side. This is finite because each $C_{0,j}$ is finite, and therefore
$C_0$ has the correct form.


Lemma 10.66. Let $E$ be a standard subset of $\mathcal{M}$ with $k$ parameters, and let
$\gamma$ be its associated translation. Then the number of monomials $X^\alpha$ with $|\alpha| \le s$
such that $X^\alpha$ is in $E$ is equal to the binomial coefficient

$$\binom{k + s - |\gamma|}{s - |\gamma|}$$

if $s \ge |\gamma|$. This expression is a polynomial function of $s$ of degree $k$, and the
coefficient of $s^k$ is $1/k!$.




PROOF. Let $\langle e_{j_1}, \dots, e_{j_k} \rangle$ be the associated vector subspace for $E$. The associated
translation $\gamma$ is assumed to have $\gamma_i = 0$ for $i$ in $\{j_1, \dots, j_k\}$. We are to
count monomials $X^\alpha = X^\gamma X^\beta$ with $\beta$ in $\langle e_{j_1}, \dots, e_{j_k} \rangle$ and with $|\gamma + \beta| \le s$.
Since $|\gamma| + |\beta| = |\gamma + \beta| \le s$, the latter condition on $\beta$ is that $|\beta| \le s - |\gamma|$,
which by assumption is $\ge 0$. The entries of $\beta$ are allowed to be arbitrary nonnegative
integers in the $k$ entries $j_1, \dots, j_k$, subject only to the limitation that the sum of
the entries is to be $\le s - |\gamma|$. The number of such $\beta$'s equals the number of
homogeneous monomials in $k + 1$ variables of total degree equal to $s - |\gamma|$. This
number is recalled in a bulleted list in Section 3 and is $\binom{(s - |\gamma|) + k}{k} = \binom{k + s - |\gamma|}{s - |\gamma|}$.
When expanded out, this binomial coefficient equals

$$\frac{1}{k!} (s + k - |\gamma|)(s + k - 1 - |\gamma|) \cdots (s + 1 - |\gamma|),$$

which is a polynomial function of $s$ of degree $k$ with leading coefficient $1/k!$.
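
The count in Lemma 10.66 can be compared against direct enumeration. The sketch below is only an illustration: the helper `count_standard_subset` and the 0-based encoding of a standard subset by a translation tuple `gamma` together with a set `free` of parameter indices are assumptions, not notation from the text.

```python
from itertools import product
from math import comb

def count_standard_subset(gamma, free, s):
    """Count exponent vectors alpha = gamma + beta with |alpha| <= s, where
    beta has nonnegative integer entries supported on the index set `free`."""
    n = len(gamma)
    # The associated translation must vanish on the parameter indices.
    assert all(gamma[i] == 0 for i in free)
    count = 0
    for alpha in product(range(s + 1), repeat=n):
        if sum(alpha) > s:
            continue
        # Off the parameter indices, alpha must agree with the translation.
        if all(alpha[i] == gamma[i] for i in range(n) if i not in free):
            count += 1
    return count

gamma, free = (0, 2, 0, 1), {0, 2}      # k = 2 parameters, |gamma| = 3
k, size = len(free), sum(gamma)
checks = [
    (count_standard_subset(gamma, free, s), comb(k + s - size, s - size))
    for s in range(size, 9)
]
```

Each pair in `checks` holds the enumerated count next to the binomial coefficient of the lemma for one value of $s \ge |\gamma|$.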


Lemma 10.67. Let $E$ and $F$ be standard subsets of $\mathcal{M}$ with $k$ and $l$ parameters,
respectively. Then $E \cap F$ either is empty or is a standard subset of $\mathcal{M}$ with $m$
parameters, where $m \le \min(k, l)$. Moreover, the only way that $m$ can equal
$\max(k, l)$ is for $E$ to equal $F$.


PROOF. Denote the respective associated translations for $E$ and $F$ by $\gamma_E$ and
$\gamma_F$, and let $S_E$ and $S_F$ be the subsets of $\{1, \dots, n\}$ such that $\langle e_i \mid i \in S_E \rangle$ and
$\langle e_i \mid i \in S_F \rangle$ are the associated vector spaces for $E$ and $F$, respectively. Let $T_E$
be the subset of indices

$$T_E = \{ i \in \{1, \dots, n\} \mid (\gamma_E)_i > 0 \},$$

and define $T_F$ similarly. We are given that $|S_E| = k$ and $|S_F| = l$. Also, we are
given that $S_E \cap T_E = \varnothing$ and $S_F \cap T_F = \varnothing$, i.e., that $T_E \subseteq S_E^c$ and $T_F \subseteq S_F^c$. If
$E \cap F \ne \varnothing$, then there exist $x$ and $y$ with

$$\gamma_E + x = \gamma_F + y \quad \text{such that } x_i = 0 \text{ for } i \notin S_E \text{ and } y_j = 0 \text{ for } j \notin S_F. \tag{$*$}$$

Then $x_i = y_i = 0$ for $i \in S_E^c \cap S_F^c$, and we see that a necessary condition to have
$E \cap F \ne \varnothing$ is that $(\gamma_E)_i = (\gamma_F)_i$ for $i \in S_E^c \cap S_F^c$. In this case the $x$ and $y$ in $(*)$
must have $x_i = (\gamma_F)_i$ for $i \in S_E \cap S_F^c$ and $y_i = (\gamma_E)_i$ for $i \in S_E^c \cap S_F$.

Conversely if $(\gamma_E)_i = (\gamma_F)_i$ for $i \in S_E^c \cap S_F^c$, then we can define $x_i = (\gamma_F)_i$
for $i \in S_E \cap S_F^c$, $y_i = (\gamma_E)_i$ for $i \in S_E^c \cap S_F$, and $x_i = y_i$ to be arbitrary for
$i \in S_E \cap S_F$, and we obtain solutions of $(*)$. It is evident that all solutions of
$(*)$ are obtained this way. Consequently $E \cap F$ is the standard subset of $\mathcal{M}$ with
$|S_E \cap S_F|$ parameters; with associated translation $\gamma$ having $\gamma_i$ equal to $(\gamma_E)_i$ on $S_E^c$,
equal to $(\gamma_F)_i$ on $S_F^c$, and equal to 0 on $S_E \cap S_F$; and with associated vector space
$\langle e_i \mid i \in S \rangle$, where $S = S_E \cap S_F$.

The inequality $|S_E \cap S_F| \le \min(|S_E|, |S_F|)$ is the inequality
$m \le \min(k, l)$ of the lemma. If $m = \max(k, l)$, then we must have $S = S_E = S_F$
and an equality $(\gamma_E)_i = (\gamma_F)_i$ for $i \in S_E^c \cap S_F^c$, i.e., for $i \notin S$. The latter equality
implies that $\gamma_E = \gamma_F$. Hence $E = F$.
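
The recipe in the proof of Lemma 10.67 translates directly into a small routine. This is a hedged sketch rather than anything from the text: the names `intersect_standard` and `members`, the 0-based indices, and the truncation `bound` used for the brute-force comparison are all assumptions for illustration.

```python
from itertools import product

def intersect_standard(gamma_e, s_e, gamma_f, s_f):
    """Return (gamma, S) describing E ∩ F, or None when E ∩ F is empty.
    Each standard subset is given by its translation and its parameter index set."""
    n = len(gamma_e)
    # Necessary and sufficient: the translations agree off both parameter sets.
    for i in range(n):
        if i not in s_e and i not in s_f and gamma_e[i] != gamma_f[i]:
            return None
    s = s_e & s_f
    gamma = tuple(
        0 if i in s else (gamma_e[i] if i not in s_e else gamma_f[i])
        for i in range(n)
    )
    return gamma, s

def members(gamma, s, bound):
    """Exponent vectors of the standard subset with all entries below `bound`."""
    n = len(gamma)
    return {a for a in product(range(bound), repeat=n)
            if all(a[i] == gamma[i] for i in range(n) if i not in s)}

E = ((0, 2, 0), {0, 2})     # exponents (a, 2, c)
F = ((1, 0, 0), {1, 2})     # exponents (1, b, c)
result = intersect_standard(*E, *F)
```

Here `result` describes the exponents $(1, 2, c)$, a standard subset with one parameter, in line with the bound $m \le \min(k, l)$.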




Theorem 10.68. If $\mathfrak a$ is a monomial ideal in $k[X_1, \dots, X_n]$ such that $V(\mathfrak a)$ has
geometric dimension $d$, then there exists a polynomial $H_a(s, \mathfrak a)$ in one variable
of degree $d$ such that the affine Hilbert function $\mathcal{H}_a(s, \mathfrak a)$ is equal to $H_a(s, \mathfrak a)$ for
all positive $s$ sufficiently large. The leading coefficient of $H_a(s, \mathfrak a)$ is positive.


REMARK. The polynomial $H_a(s, \mathfrak a)$ is called the affine Hilbert polynomial
of the monomial ideal $\mathfrak a$. It is of course uniquely determined.


PROOF. For $s$ sufficiently large, we are to count the number of monomials
$X^\alpha$ with $|\alpha| \le s$ lying in the complementary set $C(\mathfrak a)$ to $\mathfrak a$. Proposition 10.64b
and Theorem 10.65 together show that $C(\mathfrak a) = C_0 \cup \cdots \cup C_d$ disjointly, with $C_k$
equal to a finite union of standard subsets of $\mathcal{M}$ with $k$ parameters and with $C_d$
nonempty. The sets $C_k$ being disjoint, it is enough to show that the number of
such monomials in $C_k$ is a function equal for large $s$ to a polynomial of degree $k$,
provided $C_k$ is nonempty.

According to Lemma 10.66, if $E$ is a standard subset of $\mathcal{M}$ with $k$ parameters,
if $s \ge 0$ is sufficiently large, and if $\gamma$ is the translation parameter, then the number
of monomials $X^\alpha$ in $E$ with $|\alpha| \le s$ is $\binom{k + s - |\gamma|}{s - |\gamma|}$ if $s \ge |\gamma|$, which is a polynomial
of degree $k$ with positive leading coefficient.

Because the sets $E$ of this kind whose finite union is $C_k$ may not be disjoint
and because we seek an exact answer for the cardinality $|C_k|$ when $s$ is large, we
cannot simply add finitely many such expressions to obtain a value for $|C_k|$. We
have to take into account the overlaps of the various sets $E$. Thus suppose that
$C_k = E_1 \cup \cdots \cup E_r$ for standard subsets $E_1, \dots, E_r$ of $\mathcal{M}$ with $k$ parameters.
Without loss of generality, we may assume that no two of the sets $E_1, \dots, E_r$ are
equal to one another. Let $E_1(s), \dots, E_r(s)$ be the respective subsets of elements
$\alpha$ with $|\alpha| \le s$. We use the inclusion-exclusion formula, namely

$$\Big| \bigcup_{i=1}^{r} E_i(s) \Big| = \sum_{i} |E_i(s)| - \sum_{i_1 < i_2} |E_{i_1}(s) \cap E_{i_2}(s)| + \cdots + (-1)^{l+1} \sum_{i_1 < \cdots < i_l} \Big| \bigcap_{j=1}^{l} E_{i_j}(s) \Big| + \cdots;$$

this is a formula in Boolean algebra that is readily proved by induction on $r$
starting from the formula $|E \cup F| = |E| + |F| - |E \cap F|$.

Lemma 10.66 shows that $\sum_i |E_i(s)|$ is a sum of functions equal for large $s > 0$
to polynomials of degree $d$ with positive leading coefficient. The leading
coefficients cannot cancel, and thus the sum is for large $s > 0$ equal to a polynomial
of degree $d$ with positive leading coefficient. Each of the remaining terms on
the right side of the inclusion-exclusion formula, according to Lemma 10.67, is
plus or minus the number of monomials $X^\alpha$ with $|\alpha| \le s$ in some standard subset
$E$ of $\mathcal{M}$ whose number of parameters is $< d$. Hence the sum of all those terms
is a function equal for large $s$ to a polynomial that is 0 or has degree $< d$. The
theorem follows.




Proposition 10.69. A polynomial $P(s)$ in one variable of degree $d$ takes
integer values for $s$ sufficiently large and positive if and only if it is an integer
linear combination of the polynomials $s \mapsto \binom{s}{j}$ for $0 \le j \le d$.

PROOF. The sufficiency is immediate because $\binom{s}{j}$ is an integer for each $j$ and
$s$. For necessity, suppose that $P(s)$ is integer-valued and has degree $d$. Since
$s \mapsto \binom{s}{j}$ is integer-valued of degree $j$ with leading coefficient $1/j!$, $P(s)$ is
certainly a rational linear combination of the polynomials $s \mapsto \binom{s}{j}$. We prove
by induction on $d$ that the coefficients are integers. For $\deg P(s) = 0$, we have
$\binom{s}{0} = 1$, and there is nothing to prove. Given an integer-valued $P(s)$ of degree
$d$, write $P(s) = \sum_{j=0}^{d} a_j \binom{s}{j}$. Form

$$\Delta P(s) = P(s + 1) - P(s) = \sum_{j=0}^{d} a_j \Big( \binom{s+1}{j} - \binom{s}{j} \Big) = \sum_{j=1}^{d} a_j \binom{s}{j-1},$$

the third equality holding by Pascal's triangle. Since $\Delta P(s)$ is integer-valued
and has degree $d - 1$, the inductive hypothesis shows that $a_{j+1}$ is an integer for
$0 \le j \le d - 1$; i.e., $a_j$ is an integer for $1 \le j \le d$. Therefore $Q(s) = \sum_{j=1}^{d} a_j \binom{s}{j}$
is integer-valued. Since $P(s) - Q(s) = a_0$ is integer-valued and constant, $a_0$ is
an integer.
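
The coefficients in the binomial basis can be found by repeated differencing, mirroring the operator $\Delta$ of the proof. The following sketch rests on the standard identity $a_j = (\Delta^j P)(0)$, which is an assumption brought in for illustration and not a statement from the text; the function name `binomial_basis_coefficients` is hypothetical.

```python
from math import comb

def binomial_basis_coefficients(values):
    """Given P(0), P(1), ..., P(d) for a polynomial P of degree <= d, return
    [a_0, ..., a_d] with P(s) = sum_j a_j * C(s, j), using a_j = (Delta^j P)(0)."""
    coeffs = []
    row = list(values)
    while row:
        coeffs.append(row[0])
        # Replace the row by its forward differences Delta P.
        row = [b - a for a, b in zip(row, row[1:])]
    return coeffs

# P(s) = s^2 is integer-valued, and its binomial-basis coefficients are integers.
coeffs = binomial_basis_coefficients([s * s for s in range(3)])
reconstructed = [sum(a * comb(s, j) for j, a in enumerate(coeffs)) for s in range(10)]
```

The polynomial $s(s+1)/2$ has a non-integer coefficient in powers of $s$, yet the same routine returns integer coefficients for it, as the proposition predicts.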


Corollary 10.70. If $\mathfrak a$ is a monomial ideal in $k[X_1, \dots, X_n]$ such that $V(\mathfrak a)$
has geometric dimension $d$, then the affine Hilbert polynomial $H_a(s, \mathfrak a)$ of $\mathfrak a$ is of
the form $H_a(s, \mathfrak a) = \sum_{j=0}^{d} a_j \binom{s}{d-j}$ with integer coefficients $a_j$ and with $a_0 > 0$.


PROOF. This follows by combining Theorem 10.68 and Proposition 10.69. 


10. Hilbert Polynomial in the Affine Case 


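We continue with an algebraically closed field $k$ and with the polynomial ring
$A = k[X_1, \dots, X_n]$. Let $\mathfrak a$ be an ideal in $A$. For each integer $s \ge 0$, let $A_{\le s}$ be
the vector subspace of $A$ consisting of 0 and all elements of degree at most $s$, and
put $\mathfrak a_{\le s} = \mathfrak a \cap A_{\le s}$. The inclusion of $A_{\le s}$ into $A$ descends to a $k$ linear mapping
$A_{\le s}/\mathfrak a_{\le s} \to A/\mathfrak a$, and this is one-one because $A_{\le s} \cap \mathfrak a \subseteq \mathfrak a_{\le s}$. Thus we can
regard $A_{\le s}/\mathfrak a_{\le s}$, as $s$ varies, as a sequence of successively better approximations
to $A/\mathfrak a$. We define the affine Hilbert function $\mathcal{H}_a(s, \mathfrak a)$ of $\mathfrak a$ by

$$\mathcal{H}_a(s, \mathfrak a) = \dim_k A_{\le s}/\mathfrak a_{\le s} \qquad \text{for } s \ge 0.$$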


When $\mathfrak a$ is a monomial ideal, this function is the one that was investigated in
the previous section. In fact, the monomials of degree $\le s$ form a vector-space
basis of $A_{\le s}$, and the monomials in $\mathfrak a$ of degree $\le s$ form a basis of $\mathfrak a_{\le s}$ because $\mathfrak a$
is spanned by monomials. If $C(\mathfrak a)$ denotes the set of monomials not in $\mathfrak a$, then the
monomials of degree $\le s$ within $C(\mathfrak a)$ descend to a basis of $A_{\le s}/\mathfrak a_{\le s}$. The number
of such monomials gives the value of the affine Hilbert function as defined in the
previous section, and thus the new definition is consistent with the old one in the
case of monomial ideals.

When $\mathfrak a$ is a proper monomial ideal, we found in Theorem 10.68 that $\mathcal{H}_a(s, \mathfrak a)$
equals a polynomial function of $s$ for $s$ sufficiently large and that the degree of
this polynomial function equals the geometric dimension of the zero locus $V(\mathfrak a)$
in the affine space $\mathbb{A}^n$. Our goal in this section is to show that these conclusions
remain valid for all proper ideals $\mathfrak a$. The polynomial function that results for such
an $\mathfrak a$ will be called the affine Hilbert polynomial of $\mathfrak a$.

We shall make the connection between general ideals $\mathfrak a$ and monomial ideals
by means of the theory of Sections VIII.7–VIII.10. We recall the notion of a
monomial ordering as defined in Section VIII.7. A monomial ordering $\le$ is said
to be a graded monomial ordering if $|\beta| < |\alpha|$ implies $X^\beta < X^\alpha$. The graded
lexicographic ordering and the graded reverse lexicographic ordering (Examples 2
and 3 in Section VIII.7) are examples of graded monomial orderings, but the
lexicographic ordering in Example 1 in that section is not a graded monomial
ordering.

Fix a graded monomial ordering. As in Section VIII.7, $\mathrm{LT}(f)$ denotes the
leading monomial term of the polynomial $f$. By convention, $\mathrm{LT}(0) = 0$. For our
ideal $\mathfrak a$, we let $\mathrm{LT}(\mathfrak a)$ be the vector space of all linear combinations of polynomials
$\mathrm{LT}(f)$ for $f \in \mathfrak a$. This is an ideal in $A$, and it is a monomial ideal. The connection
between the goal of this section and the results of the previous section rests on
the following remarkable theorem.


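Theorem 10.71 (Macaulay). Let a graded monomial ordering be imposed
on $k[X_1, \dots, X_n]$. If $\mathfrak a$ is any ideal in $k[X_1, \dots, X_n]$, then the affine Hilbert
functions of $\mathfrak a$ and $\mathrm{LT}(\mathfrak a)$ coincide: $\mathcal{H}_a(s, \mathfrak a) = \mathcal{H}_a(s, \mathrm{LT}(\mathfrak a))$.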


PROOF. Fix $s \ge 0$. It is enough to prove that $\mathfrak a_{\le s}$ and $\mathrm{LT}(\mathfrak a)_{\le s}$ have the same $k$
dimension. Since there are only finitely many monomials of degree $\le s$, we can
choose $f_1, \dots, f_k$ in $\mathfrak a$ such that their leading monomials $\mathrm{LM}(f_1), \dots, \mathrm{LM}(f_k)$
are distinct and form a vector-space basis of $\mathrm{LT}(\mathfrak a)_{\le s}$. Without loss of generality,
we may assume that $\mathrm{LM}(f_1) > \cdots > \mathrm{LM}(f_k)$. Certainly $\dim \mathrm{LT}(\mathfrak a)_{\le s} = k$, and
thus it is enough to show that $f_1, \dots, f_k$ lie in $\mathfrak a_{\le s}$ and form a vector-space basis
of $\mathfrak a_{\le s}$.

For each $j$, $\mathrm{LM}(f_j - \mathrm{LT}(f_j)) < \mathrm{LM}(f_j)$. Since the monomial ordering is
graded, this inequality implies that $\deg(f_j - \mathrm{LT}(f_j)) \le s$. But we know that
$\deg(\mathrm{LT}(f_j)) \le s$, and therefore $\deg f_j \le s$. Consequently $f_j$ lies in $\mathfrak a_{\le s}$.

To prove that $\{f_1, \dots, f_k\}$ is linearly independent, suppose that $\sum_{j=1}^{k} c_j f_j = 0$
with all $c_j$ in $k$. Arguing by contradiction, suppose that not all $c_j$ are 0. Let $i$ be the
least index $j$ for which $c_j \ne 0$; then $\mathrm{LM}(f_i) = \mathrm{LM}(c_i f_i) = \mathrm{LM}\big( -\sum_{j > i} c_j f_j \big) \le
\max_{j > i} \mathrm{LM}(f_j)$, and we arrive at a contradiction. We conclude that $\{f_1, \dots, f_k\}$
is linearly independent.

To prove that $\{f_1, \dots, f_k\}$ spans $\mathfrak a_{\le s}$, we again argue by contradiction. Among
all $g$ in $\mathfrak a_{\le s}$ with $g$ not in the linear span of $\{f_1, \dots, f_k\}$, choose one for which
$\mathrm{LM}(g)$ is the smallest. Certainly $\mathrm{LM}(g)$ is one of $\mathrm{LM}(f_1), \dots, \mathrm{LM}(f_k)$. Say that
$\mathrm{LM}(g) = \mathrm{LM}(f_i)$. For some scalar $c \ne 0$, we must have $\mathrm{LT}(g) = \mathrm{LT}(c f_i)$. Then
$\mathrm{LM}(g - c f_i) < \mathrm{LM}(g)$, and the minimality of $\mathrm{LM}(g)$ forces $g - c f_i$ to be in the
linear span of $\{f_1, \dots, f_k\}$. Since $c f_i$ is in the linear span, so is $g$, contradiction.
Thus $\{f_1, \dots, f_k\}$ is a spanning set of $\mathfrak a_{\le s}$.


Corollary 10.72. If $\mathfrak a$ is an ideal in $k[X_1, \dots, X_n]$, then for all $s$ sufficiently
large, the affine Hilbert function $\mathcal{H}_a(s, \mathfrak a)$ of $\mathfrak a$ equals a polynomial in $s$ of the
form $\sum_{j=0}^{d} a_j \binom{s}{d-j}$ with integer coefficients $a_j$ and with $a_0 > 0$.

REMARKS. The polynomial in the statement of the corollary is called the affine
Hilbert polynomial of $\mathfrak a$ and is denoted by $H_a(s, \mathfrak a)$. It is the 0 polynomial if and
only if $\mathfrak a = k[X_1, \dots, X_n]$.


PROOF. Theorem 10.71 says that $\mathcal{H}_a(s, \mathfrak a) = \mathcal{H}_a(s, \mathrm{LT}(\mathfrak a))$. Consequently the
result follows immediately by applying Corollary 10.70 to $\mathrm{LT}(\mathfrak a)$.


Corollary 10.73. If a graded monomial ordering is imposed on $k[X_1, \dots, X_n]$
and if $\mathfrak a$ is any ideal in $k[X_1, \dots, X_n]$, then the affine Hilbert polynomials of $\mathfrak a$
and $\mathrm{LT}(\mathfrak a)$ coincide: $H_a(s, \mathfrak a) = H_a(s, \mathrm{LT}(\mathfrak a))$.


PROOF. This is immediate from Theorem 10.71 and the definition of the affine 
Hilbert polynomial given in the remarks with Corollary 10.72. 


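Corollary 10.74. If $\mathfrak a$ and $\mathfrak b$ are proper ideals of $k[X_1, \dots, X_n]$ such that
$\mathfrak a \subseteq \mathfrak b$, then $\deg H_a(s, \mathfrak a) \ge \deg H_a(s, \mathfrak b)$.


PROOF. Introduce a graded monomial ordering. The inclusion $\mathfrak a \subseteq \mathfrak b$ implies
that $\mathrm{LT}(\mathfrak a) \subseteq \mathrm{LT}(\mathfrak b)$. Therefore $C(\mathrm{LT}(\mathfrak a)) \supseteq C(\mathrm{LT}(\mathfrak b))$. Proposition 10.64b shows
that the geometric dimension of $V(\mathrm{LT}(\mathfrak a))$ is the largest vector-space dimension
of a coordinate subspace that lies in $C(\mathrm{LT}(\mathfrak a))$, and the same thing is true for
$\mathrm{LT}(\mathfrak b)$. Thus the geometric dimension of $V(\mathrm{LT}(\mathfrak a))$ is $\ge$ the geometric dimension
of $V(\mathrm{LT}(\mathfrak b))$. By Theorem 10.68, $\deg H_a(s, \mathrm{LT}(\mathfrak a)) \ge \deg H_a(s, \mathrm{LT}(\mathfrak b))$. The
result now follows immediately from Corollary 10.73.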


The affine Hilbert polynomial $H_a(s, \mathfrak a)$ of $\mathfrak a$ depends on $\mathfrak a$, not just $V(\mathfrak a)$, but
we shall be interested mainly in the degree of $H_a(s, \mathfrak a)$. Proposition 10.76, as
amplified in Corollary 10.77, implies that the degree depends only on $V(\mathfrak a)$. It
requires a lemma.




Lemma 10.75. If $\mathfrak a$ is a monomial ideal in $k[X_1, \dots, X_n]$, then so is $\sqrt{\mathfrak a}$.


PROOF. The preliminary remarks in Section 9 show that $V(\mathfrak a)$ is a finite union
of coordinate subspaces. Let us write $V(\mathfrak a) = \bigcup_i E_i$ accordingly. By Proposition
10.2b, $\sqrt{\mathfrak a} = I(V(\mathfrak a)) = I\big(\bigcup_i E_i\big) = \bigcap_i I(E_i)$. Since $E_i$ is an affine variety
and is equal to $V(X_{i_1}, \dots, X_{i_r})$ for suitable $X_{i_1}, \dots, X_{i_r}$, the Nullstellensatz
shows that $I(E_i)$ is an ideal of the form $I(E_i) = (X_{i_1}, \dots, X_{i_r})$. This is a
monomial ideal, and it is therefore enough to show that the finite intersection of
monomial ideals is a monomial ideal. By induction it is enough to show that
$\mathfrak b \cap \mathfrak c$ is a monomial ideal if $\mathfrak b$ and $\mathfrak c$ are monomial ideals. If an element of $\mathfrak b \cap \mathfrak c$ is
given, then that element is a linear combination of the monomials in $\mathfrak b$ and is also
a linear combination of the monomials in $\mathfrak c$. Since $\mathcal{M}$ is linearly independent, the
element is a linear combination of monomials lying in $\mathfrak b \cap \mathfrak c$. Therefore $\mathfrak b \cap \mathfrak c$ is a
monomial ideal.


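Proposition 10.76. If $\mathfrak a$ is a proper ideal in $k[X_1, \dots, X_n]$, then the degrees
of the affine Hilbert polynomials $H_a(s, \mathfrak a)$ and $H_a(s, \sqrt{\mathfrak a})$ are equal.


PROOF. Fix a graded monomial ordering. We begin by proving that

$$\mathrm{LT}(\mathfrak a) \subseteq \mathrm{LT}(\sqrt{\mathfrak a}) \subseteq \sqrt{\mathrm{LT}(\mathfrak a)}. \tag{$*$}$$

The left-hand inclusion is immediate because $\mathfrak a \subseteq \sqrt{\mathfrak a}$. For the right-hand
inclusion, let $f \ne 0$ be in $\sqrt{\mathfrak a}$, and let $X^\alpha = \mathrm{LM}(f)$ be the leading monomial of
$f$. Since $f$ is in $\sqrt{\mathfrak a}$, $f^r$ is in $\mathfrak a$ for some $r > 0$. Since the leading monomial of a
product is the product of the leading monomials, $\mathrm{LM}(f^r) = X^{r\alpha}$. Thus a power
of $X^\alpha$ is exhibited as in $\mathrm{LT}(\mathfrak a)$, and $X^\alpha$ is in $\sqrt{\mathrm{LT}(\mathfrak a)}$. This proves $(*)$.

Applying Corollary 10.74 to $(*)$, we obtain

$$\deg H_a(s, \mathrm{LT}(\mathfrak a)) \ge \deg H_a(s, \mathrm{LT}(\sqrt{\mathfrak a})) \ge \deg H_a\big(s, \sqrt{\mathrm{LT}(\mathfrak a)}\big). \tag{$**$}$$

The ideal $\mathrm{LT}(\mathfrak a)$ is a monomial ideal, and Lemma 10.75 shows that $\sqrt{\mathrm{LT}(\mathfrak a)}$ is a
monomial ideal. Then $\mathrm{LT}(\mathfrak a)$ and $\sqrt{\mathrm{LT}(\mathfrak a)}$ are monomial ideals with $V(\mathrm{LT}(\mathfrak a)) =
V\big(\sqrt{\mathrm{LT}(\mathfrak a)}\big)$, and Theorem 10.68 shows that

$$\deg H_a(s, \mathrm{LT}(\mathfrak a)) = \deg H_a\big(s, \sqrt{\mathrm{LT}(\mathfrak a)}\big).$$

Comparing this conclusion with $(**)$, we see that

$$\deg H_a(s, \mathrm{LT}(\mathfrak a)) = \deg H_a(s, \mathrm{LT}(\sqrt{\mathfrak a})). \tag{$\dagger$}$$

In combination with the equalities $H_a(s, \mathfrak a) = H_a(s, \mathrm{LT}(\mathfrak a))$ and $H_a(s, \sqrt{\mathfrak a}) =
H_a(s, \mathrm{LT}(\sqrt{\mathfrak a}))$ given by Corollary 10.73, $(\dagger)$ completes the proof.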




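Corollary 10.77. If $\mathfrak a$ and $\mathfrak b$ are proper ideals in $k[X_1, \dots, X_n]$ with $V(\mathfrak a) \subseteq
V(\mathfrak b)$, then $\deg H_a(s, \mathfrak a) \le \deg H_a(s, \mathfrak b)$.


PROOF. Application of $I(\,\cdot\,)$ to the inclusion $V(\mathfrak a) \subseteq V(\mathfrak b)$ gives $\sqrt{\mathfrak a} =
I(V(\mathfrak a)) \supseteq I(V(\mathfrak b)) = \sqrt{\mathfrak b}$. Then Corollary 10.74 and Proposition 10.76 together
yield $\deg H_a(s, \mathfrak a) = \deg H_a(s, \sqrt{\mathfrak a}) \le \deg H_a(s, \sqrt{\mathfrak b}) = \deg H_a(s, \mathfrak b)$.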


Theorem 10.78. If $\mathfrak a$ is a prime ideal in $k[X_1, \dots, X_n]$, then the degree of the
affine Hilbert polynomial $H_a(s, \mathfrak a)$ equals the geometric dimension of the affine
variety $V(\mathfrak a)$.


PROOF. Define $d = \deg H_a(s, \mathfrak a)$ and $V = V(\mathfrak a)$, and let $A(V)$ be the affine
coordinate ring $A(V) = k[X_1, \dots, X_n]/\mathfrak a$. Theorem 10.7 shows that $\dim V$
equals the Krull dimension of $A(V)$, and Theorem 7.22 shows that the latter
equals the transcendence degree over $k$ of the field of fractions $k(V)$ of $A(V)$.
Thus the theorem will follow if we show that $k(V)$ has transcendence degree $d$
over $k$.

Let $\varphi : k[X_1, \dots, X_n] \to A(V)$ be the quotient homomorphism, and put
$x_i = \varphi(X_i)$ for $1 \le i \le n$. Introduce a graded monomial ordering on $\mathcal{M}$.
Corollary 10.73 shows that $H_a(s, \mathfrak a) = H_a(s, \mathrm{LT}(\mathfrak a))$, and Theorem 10.68 shows
that $V(\mathrm{LT}(\mathfrak a))$ has geometric dimension $d$. We saw in Section 9 that the zero locus
of a monomial ideal is the finite union of coordinate subspaces, and it follows
that $V(\mathrm{LT}(\mathfrak a)) \subseteq \mathbb{A}^n$ contains a coordinate subspace $E$ of dimension $d$. Let $E$
have as basis the standard vectors $e_{j_1}, \dots, e_{j_d}$, so that

$$E = V(\{X_i \mid i \notin \{j_1, \dots, j_d\}\}).$$

The set $E$ is a variety, and thus $I(E) = (\{X_i \mid i \notin \{j_1, \dots, j_d\}\})$. Also, $E \subseteq
V(\mathrm{LT}(\mathfrak a))$, and hence $I(E) \supseteq I(V(\mathrm{LT}(\mathfrak a))) \supseteq \mathrm{LT}(\mathfrak a)$. If $X^\alpha$ is a monomial in $\mathrm{LT}(\mathfrak a)$,
then it follows that $X^\alpha$ lies in the ideal generated by the $X_i$ for $i \notin \{j_1, \dots, j_d\}$.
We can summarize this fact as follows: if we write $k[X_{j_1}, \dots, X_{j_d}]$ for the subring
of $k[X_1, \dots, X_n]$ of polynomials involving only $X_{j_1}, \dots, X_{j_d}$, then
$$\mathrm{LT}(\mathfrak a) \cap k[X_{j_1}, \dots, X_{j_d}] = 0. \tag{$*$}$$
If $f$ is any nonzero member of $k[X_{j_1}, \dots, X_{j_d}]$, then its leading monomial $\mathrm{LM}(f)$
has to lie in $k[X_{j_1}, \dots, X_{j_d}]$, and thus $(*)$ implies that
$$\mathfrak a \cap k[X_{j_1}, \dots, X_{j_d}] = 0. \tag{$**$}$$


Using $(**)$ and notation introduced at the beginning of Section VII.4, we
shall show that $x_{j_1}, \dots, x_{j_d}$ are algebraically independent over $k$, and then it
follows that $d \le \operatorname{tr.deg} A(V)$. Thus suppose that $g(Y_1, \dots, Y_d)$ is a polynomial
in $k[Y_1, \dots, Y_d]$ such that $g(x_{j_1}, \dots, x_{j_d}) = 0$. We can identify $k[Y_1, \dots, Y_d]$
with $k[X_{j_1}, \dots, X_{j_d}] \subseteq k[X_1, \dots, X_n]$, and then the equality $g(x_{j_1}, \dots, x_{j_d}) = 0$
means that $\varphi(g) = 0$, i.e., $g$ is in $\mathfrak a$. Hence $g$ is a member of $\mathfrak a \cap k[X_{j_1}, \dots, X_{j_d}]$,
and $g = 0$ by $(**)$. Therefore $x_{j_1}, \dots, x_{j_d}$ are algebraically independent over $k$.


For the reverse inequality, we are to prove that $d \ge \operatorname{tr.deg} A(V)$. Let $r =
\operatorname{tr.deg} A(V)$. The elements $x_i = \varphi(X_i)$ generate $A(V)$ as a $k$ algebra, and
therefore they generate $k(V)$ over $k$ as a field. By Lemma 7.6b some subset
$\{x_{j_1}, \dots, x_{j_r}\}$ of $\{x_1, \dots, x_n\}$ is algebraically independent. Consider the
substitution homomorphism

$$\psi(h) = h(x_{j_1}, \dots, x_{j_r})$$

of $k[Y_1, \dots, Y_r]$ into $A(V)$. This is one-one because the elements $x_{j_1}, \dots, x_{j_r}$ by
assumption are algebraically independent. Fix $s \ge 0$, and consider the restriction
of $\psi$ to $k[Y_1, \dots, Y_r]_{\le s}$. If $h(Y_1, \dots, Y_r)$ is a monomial $Y^\alpha$ in $k[Y_1, \dots, Y_r]_{\le s}$
with $\alpha = (\alpha_1, \dots, \alpha_r)$ and $|\alpha| \le s$, then we see that

$$\psi(Y^\alpha) = \prod_{i=1}^{r} x_{j_i}^{\alpha_i} = \varphi\Big( \prod_{i=1}^{r} X_{j_i}^{\alpha_i} \Big).$$

In other words, $\psi(Y^\alpha)$ is the image under $\varphi$ of a member of $k[X_1, \dots, X_n]$ of
degree $\le s$. Taking linear combinations of such monomials, we see that $\psi$ is
a one-one $k$ linear mapping

$$\psi : k[Y_1, \dots, Y_r]_{\le s} \to k[X_1, \dots, X_n]_{\le s}/\mathfrak a_{\le s} \subseteq A(V).$$

Therefore

$$\mathcal{H}_a(s, \mathfrak a) = \dim_k \big( k[X_1, \dots, X_n]_{\le s}/\mathfrak a_{\le s} \big) \ge \dim_k k[Y_1, \dots, Y_r]_{\le s} = \binom{s + r}{r}.$$

The binomial coefficient on the right side is a polynomial of degree $r$ in $s$ with
positive leading coefficient. For large $s$ the left side is a polynomial in $s$ of degree $d$. The
inequality forces $d \ge r$, and the proof is complete.


Proposition 10.79. If $\mathfrak a$ and $\mathfrak b$ are proper ideals in $k[X_1, \dots, X_n]$, then
$\deg H_a(s, \mathfrak a \mathfrak b) = \max\big( \deg H_a(s, \mathfrak a), \deg H_a(s, \mathfrak b) \big)$.

REMARKS. Proposition 10.1 points out that $V(\mathfrak a \mathfrak b) = V(\mathfrak a) \cup V(\mathfrak b)$. Since the
degree of the affine Hilbert polynomial of $\mathfrak a$ depends only on $V(\mathfrak a)$, this proposition
says that the degree associated with the union of two affine algebraic sets is the
larger of the degrees associated with each of the sets.


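PROOF. Impose a graded monomial ordering on $\mathcal{M}$. Let us check that
$$(\mathrm{LT}(\mathfrak a))(\mathrm{LT}(\mathfrak b)) \subseteq \mathrm{LT}(\mathfrak a \mathfrak b) \subseteq \mathrm{LT}(\mathfrak a \cap \mathfrak b) \subseteq \sqrt{(\mathrm{LT}(\mathfrak a))(\mathrm{LT}(\mathfrak b))}. \tag{$*$}$$
In fact, let $f$ be in $\mathfrak a$ and $g$ be in $\mathfrak b$, and define $X^\alpha = \mathrm{LM}(f)$ and $X^\beta = \mathrm{LM}(g)$
to be the leading monomials of $f$ and $g$. Then $X^{\alpha + \beta} = \mathrm{LM}(fg)$, and hence
the product of any generator of $\mathrm{LT}(\mathfrak a)$ and any generator of $\mathrm{LT}(\mathfrak b)$ lies in $\mathrm{LT}(\mathfrak a \mathfrak b)$.
This proves the first inclusion of $(*)$. The second inclusion is immediate because
$\mathfrak a \mathfrak b \subseteq \mathfrak a \cap \mathfrak b$. If $X^\alpha = \mathrm{LM}(f)$ with $f \in \mathfrak a \cap \mathfrak b$, then $(X^\alpha)^2 = \mathrm{LM}(f)\,\mathrm{LM}(f)$ is in
$\mathrm{LT}(\mathfrak a)\,\mathrm{LT}(\mathfrak b)$. Hence $X^\alpha$ is in $\sqrt{(\mathrm{LT}(\mathfrak a))(\mathrm{LT}(\mathfrak b))}$. Thus a generating set of $\mathrm{LT}(\mathfrak a \cap \mathfrak b)$
lies in $\sqrt{(\mathrm{LT}(\mathfrak a))(\mathrm{LT}(\mathfrak b))}$, and the third inclusion of $(*)$ follows.

In $(*)$, the values of $V(\,\cdot\,)$ on the end two members are the same, according to
Proposition 10.3c, and therefore

$$V(\mathrm{LT}(\mathfrak a)\,\mathrm{LT}(\mathfrak b)) = V(\mathrm{LT}(\mathfrak a \mathfrak b)). \tag{$**$}$$

The proposition now follows from the computation

$$\begin{aligned}
\max\big( \deg H_a(s, \mathfrak a), \deg H_a(s, \mathfrak b) \big)
&= \max\big( \deg H_a(s, \mathrm{LT}(\mathfrak a)), \deg H_a(s, \mathrm{LT}(\mathfrak b)) \big) && \text{by Corollary 10.73} \\
&= \max\big( \dim V(\mathrm{LT}(\mathfrak a)), \dim V(\mathrm{LT}(\mathfrak b)) \big) && \text{by Theorem 10.68} \\
&= \dim\big( V(\mathrm{LT}(\mathfrak a)) \cup V(\mathrm{LT}(\mathfrak b)) \big) && \text{by Theorem 10.7} \\
&= \dim V(\mathrm{LT}(\mathfrak a)\,\mathrm{LT}(\mathfrak b)) && \text{by Proposition 10.1c} \\
&= \dim V(\mathrm{LT}(\mathfrak a \mathfrak b)) && \text{by } (**) \\
&= \deg H_a(s, \mathrm{LT}(\mathfrak a \mathfrak b)) && \text{by Theorem 10.68} \\
&= \deg H_a(s, \mathfrak a \mathfrak b) && \text{by Corollary 10.73.}
\end{aligned}$$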


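Corollary 10.80. If $\mathfrak a$ is any ideal in $k[X_1, \dots, X_n]$, then the geometric
dimension of the affine algebraic set $V(\mathfrak a)$ equals the degree of the affine Hilbert
polynomial $H_a(s, \mathfrak a)$.


PROOF. Write $V(\mathfrak a) = \bigcup_{j=1}^{k} V_j$ as a finite union of affine varieties $V_j$, and define
$\mathfrak p_j = I(V_j)$. Since $V_j$ is irreducible, $\mathfrak p_j$ is prime. Moreover, $V_j = V(I(V_j)) =
V(\mathfrak p_j)$. Then Proposition 10.1c shows that $V(\mathfrak p_1 \mathfrak p_2 \cdots \mathfrak p_k) = \bigcup_{j=1}^{k} V(\mathfrak p_j) =
\bigcup_{j=1}^{k} V_j = V(\mathfrak a)$. Since the two zero loci coincide, Corollary 10.77 applied in both
directions gives $\deg H_a(s, \mathfrak a) = \deg H_a(s, \mathfrak p_1 \mathfrak p_2 \cdots \mathfrak p_k)$. Proposition 10.79 and induction give

$$\deg H_a(s, \mathfrak p_1 \mathfrak p_2 \cdots \mathfrak p_k) = \max_{1 \le j \le k} \deg H_a(s, \mathfrak p_j),$$

and Theorem 10.78 shows that the right side equals $\max_{1 \le j \le k} \dim V(\mathfrak p_j) =
\max_{1 \le j \le k} \dim V_j$, which equals $\dim V(\mathfrak a)$ by Theorem 10.7.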


As a consequence of Corollary 10.80, we obtain an algorithm for computing
the dimension of an affine algebraic set $V$ when given an ideal $\mathfrak a$ whose locus
of common zeros $V(\mathfrak a)$ is $V$: We introduce any graded monomial ordering and
compute $\mathrm{LT}(\mathfrak a)$, using a Gröbner basis. Corollaries 10.73 and 10.80 together say
that $\dim V(\mathfrak a) = \dim V(\mathrm{LT}(\mathfrak a))$. The remarks before Proposition 10.64 show how
to compute $\dim V(\mathrm{LT}(\mathfrak a))$, and Proposition 10.64b gives an alternative method of
computation.
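
The last step of this algorithm, finding $\dim V(\mathrm{LT}(\mathfrak a))$ from monomial generators, can be sketched as follows. The sketch assumes that the leading monomials of a Gröbner basis have already been computed elsewhere and are supplied as exponent vectors; the function name `dim_monomial_zero_locus` and the sample generators are hypothetical. It uses the fact that a coordinate subspace lies in the zero locus exactly when each generator involves some variable set to 0 on it.

```python
from itertools import combinations

def dim_monomial_zero_locus(generators, n):
    """Dimension of V(m) in affine n-space for the monomial ideal m generated
    by the given exponent vectors (every generator must be nonconstant)."""
    supports = [{i for i, e in enumerate(g) if e > 0} for g in generators]
    assert all(supports), "a constant generator would make the ideal improper"
    for size in range(n + 1):
        for chosen in combinations(range(n), size):
            # Setting X_i = 0 for i in `chosen` kills every generator exactly
            # when `chosen` meets the support of every generator.
            if all(sup & set(chosen) for sup in supports):
                return n - size
    raise AssertionError("unreachable: all n variables always suffice")

# Assumed input: leading monomials X^2, XY, XZ, Y^3 in k[X, Y, Z].
dimension = dim_monomial_zero_locus([(2, 0, 0), (1, 1, 0), (1, 0, 1), (0, 3, 0)], 3)
```

For these generators the zero locus is the $Z$ axis, so the routine returns 1. The search over subsets is exponential in $n$, which is acceptable only for a small number of variables.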


11. Hilbert Polynomial in the Projective Case 


In this section we consider the analog for projective space of the theory of
Section 10. We continue with $k$ as an algebraically closed field, and we let
$A = k[X_0, \dots, X_n]$. Our interest is in the zero locus $V(\mathfrak a)$ in $\mathbb{P}^n$, as defined in
Section 3, of a homogeneous ideal $\mathfrak a$ in $A$. To relate matters to Section 10, we
shall make use of the cone $C(V(\mathfrak a))$ over $V(\mathfrak a)$, which was defined in Section 3
as

$$C(V(\mathfrak a)) = \{(0, \dots, 0)\} \cup \{ (x_0, \dots, x_n) \in \mathbb{A}^{n+1} \mid [x_0, \dots, x_n] \in V(\mathfrak a) \}.$$

The homogeneous ideal $\mathfrak a$ is in particular an ideal in $n + 1$ variables, and its
associated affine algebraic set is the subset $C(V(\mathfrak a))$ of $\mathbb{A}^{n+1}$. An affine Hilbert
polynomial $H_a(s, \mathfrak a)$ is therefore associated to $C(V(\mathfrak a))$, and its degree matches
the geometric dimension of $C(V(\mathfrak a))$.

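To get something directly related to the projective algebraic set $V(\mathfrak a)$ in
projective space $\mathbb{P}^n$, we make a new definition of Hilbert function. Let $A_s =
k[X_0, \dots, X_n]_s$ be the subspace of $A$ of all polynomials homogeneous of degree
$s$. If $\mathfrak a$ is a homogeneous ideal in $A$, let $\mathfrak a_s = \mathfrak a \cap A_s$. The Hilbert function$^{24}$ of
$\mathfrak a$ is the integer-valued function of $s \ge 0$ defined by

$$\mathcal{H}(s, \mathfrak a) = \dim_k A_s/\mathfrak a_s \qquad \text{for } s \ge 0.$$

We have $A_{\le s} = A_s \oplus A_{\le s-1}$, and the fact that $\mathfrak a$ is homogeneous implies that
$\mathfrak a_{\le s} = \mathfrak a_s \oplus \mathfrak a_{\le s-1}$. Consequently $A_{\le s}/\mathfrak a_{\le s} \cong A_s/\mathfrak a_s \oplus A_{\le s-1}/\mathfrak a_{\le s-1}$. Therefore

$$\mathcal{H}(s, \mathfrak a) = \mathcal{H}_a(s, \mathfrak a) - \mathcal{H}_a(s - 1, \mathfrak a).$$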


This is the fundamental formula by which the algebraic part of the theory of the 
Hilbert function in the projective case can be reduced to the corresponding theory 
in the affine case. 
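
For a monomial ideal, which is automatically homogeneous, both sides of the fundamental formula can be enumerated directly. The following is a sketch under that assumption; the helper names `outside`, `affine_count`, and `graded_count` and the sample ideal $(X_0 X_1)$ are hypothetical choices for illustration.

```python
from itertools import product

def outside(alpha, generators):
    """True when X^alpha is not divisible by any generator of the monomial ideal."""
    return not any(all(a >= g for a, g in zip(alpha, gen)) for gen in generators)

def affine_count(generators, n_vars, s):
    """Number of monomials of degree <= s outside the monomial ideal."""
    return sum(1 for a in product(range(s + 1), repeat=n_vars)
               if sum(a) <= s and outside(a, generators))

def graded_count(generators, n_vars, s):
    """Number of monomials of degree exactly s outside the monomial ideal."""
    return sum(1 for a in product(range(s + 1), repeat=n_vars)
               if sum(a) == s and outside(a, generators))

gens = [(1, 1, 0)]      # the ideal (X0 * X1) in k[X0, X1, X2]
pairs = [
    (graded_count(gens, 3, s), affine_count(gens, 3, s) - affine_count(gens, 3, s - 1))
    for s in range(1, 7)
]
```

In each pair the two entries agree, and the common value $2s + 1$ has degree 1, the dimension of the union of two lines that this ideal defines in the projective plane.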

We know that the affine Hilbert function is a polynomial for large $s$. Since

$$s^d - (s - 1)^d = s^{d-1} + s^{d-2}(s - 1) + s^{d-3}(s - 1)^2 + \cdots + (s - 1)^{d-1}$$

is a polynomial of one lower degree and with positive leading coefficient, it
follows that the Hilbert function of $\mathfrak a$ is a polynomial for large $s$, that its degree
is $\dim C(V(\mathfrak a)) - 1$, and that its leading coefficient is positive. This polynomial
is called the Hilbert polynomial of $\mathfrak a$ and is denoted by $H(s, \mathfrak a)$. To connect the
geometric part of the theory of the Hilbert function in the projective case to the
corresponding theory in the affine case, we use the following proposition.


$^{24}$It is traditional not to include the word “projective” or any subscript, even though the
terminology is meant to refer to the projective case.




Proposition 10.81. If $\mathfrak a$ is a homogeneous ideal in $k[X_0, \dots, X_n]$ and if the
corresponding projective algebraic set $V(\mathfrak a)$ is nonempty, then

$$\dim C(V(\mathfrak a)) = \dim V(\mathfrak a) + 1.$$


PROOF. The proof of Corollary 10.13 shows that $C(V(\mathfrak a))$ is irreducible in
$\mathbb{A}^{n+1}$ if and only if $V(\mathfrak a)$ is irreducible in $\mathbb{P}^n$. Since the dimension in both cases
for a general $\mathfrak a$ is the maximum of the dimensions of irreducible closed subsets,
it is enough to prove the dimensional equality in the irreducible case.

If we have a strictly increasing sequence of irreducible closed subsets $E_0 \subsetneq
E_1 \subsetneq \cdots \subsetneq E_d$ in $\mathbb{P}^n$, then each $C(E_j)$ is irreducible in $\mathbb{A}^{n+1}$ and the sequence
$C(E_0) \subsetneq C(E_1) \subsetneq \cdots \subsetneq C(E_d)$ in $\mathbb{A}^{n+1}$ consists of Zariski closed sets that are
irreducible. Since the subset $\{0\}$ of $\mathbb{A}^{n+1}$ is irreducible and can be adjoined at the
beginning of the latter sequence, we conclude that $\dim C(V(\mathfrak a)) \ge \dim V(\mathfrak a) + 1$.

We need to prove the reverse inequality in the irreducible case. Since $V(\mathfrak a)$ is
assumed irreducible (and hence nonempty), we may assume that $\mathfrak a$ is prime and omits
at least one of $X_0, \dots, X_n$. To fix the notation, say that $X_0$ is not in $\mathfrak a$. Recall from
Section 3 the substitution homomorphism $\beta_0^t : k[X_0, \dots, X_n] \to k[X_1, \dots, X_n]$
formed by setting $X_0 = 1$. Let $\mathfrak b = \beta_0^t(\mathfrak a)$. This is a prime ideal in $k[X_1, \dots, X_n]$,
according to Theorem 10.20. Let $A(C(V(\mathfrak a))) = k[X_0, \dots, X_n]/\mathfrak a$ and $A(V(\mathfrak b))
= k[X_1, \dots, X_n]/\mathfrak b$. The homomorphism $\beta_0^t$ descends to a homomorphism of
$A(C(V(\mathfrak a)))$ onto $A(V(\mathfrak b))$, which we denote by $\bar\beta_0^t$.

Let $x_0, \dots, x_n$ be the images of $X_0, \dots, X_n$ in $A(C(V(\mathfrak a)))$. The element $x_0$
is transcendental over $k$. In fact, the only alternative is that it is a scalar $c$, since
$k$ is algebraically closed; the equality $x_0 = c$ would imply that $X_0 - c$ is in $\mathfrak a$,
and the fact that $\mathfrak a$ is homogeneous would imply that $X_0$ and $c$ are separately
in $\mathfrak a$, in contradiction to our choice of $X_0$. Consequently $k(x_0)(x_1, \dots, x_n)$
has transcendence degree $r = \dim C(V(\mathfrak a)) - 1$ over $k(x_0)$. Since $x_1, \dots, x_n$
generate $k(x_0)(x_1, \dots, x_n)$ as a field over $k(x_0)$, some subset $\{x_{j_1}, \dots, x_{j_r}\}$ of
$\{x_1, \dots, x_n\}$ is a transcendence basis of $k(x_0)(x_1, \dots, x_n)$ as a field over $k(x_0)$.
Thus $\{x_0, x_{j_1}, \dots, x_{j_r}\}$ is a transcendence basis of $k(x_0, \dots, x_n)$ over $k$.

The elements $x_0, x_{j_1}, \dots, x_{j_r}$ all lie in $A(C(V(\mathfrak a)))$, and we consider their
images $1, \bar\beta_0^t(x_{j_1}), \dots, \bar\beta_0^t(x_{j_r})$ in $A(V(\mathfrak b))$. Suppose that $h(Y_1, \dots, Y_r)$ is a
polynomial in $r$ variables exhibiting the last $r$ of these images as algebraically
dependent. That is, suppose that

$$h\big( \bar\beta_0^t(x_{j_1}), \dots, \bar\beta_0^t(x_{j_r}) \big) = 0. \tag{$*$}$$

Let $h$ have degree $d$. We regard $h$ as a member of $k[X_1, \dots, X_n]_{\le d}$ that depends
only on $X_{j_1}, \dots, X_{j_r}$. With this notational change, $(*)$ reads

$$h(X_1, \dots, X_n) \text{ is in } \mathfrak b. \tag{$**$}$$




We now refer to the details of the proof of Theorem 10.20 that are summarized
before Proposition 10.33. The linear mapping $\varphi_d$ with $\varphi_d(f)(X_0, \dots, X_n) =
X_0^d f(X_1/X_0, \dots, X_n/X_0)$ is a two-sided inverse to $\beta_0^t : k[X_0, \dots, X_n]_d \to
k[X_1, \dots, X_n]_{\le d}$. Put $H = \varphi_d(h)$, so that $h = \beta_0^t(H)$. The detail in question is
that

$$\mathfrak a \cap k[X_0, \dots, X_n]_d = \varphi_d\big( \mathfrak b \cap k[X_1, \dots, X_n]_{\le d} \big). \tag{$\dagger$}$$

By $(**)$, $\varphi_d(h)$ is in the right side of $(\dagger)$. Since $(\dagger)$ is a valid identity, $\varphi_d(h)$ is in the
left side. So $H$ is in $\mathfrak a$. This means that $H(x_0, \dots, x_n) = 0$. Remembering that $H$
depends only on $X_0, X_{j_1}, \dots, X_{j_r}$ and that $\{x_0, x_{j_1}, \dots, x_{j_r}\}$ is a transcendence set,
we see that $H = 0$. Therefore $h = 0$, and $\{\bar\beta_0^t(x_{j_1}), \dots, \bar\beta_0^t(x_{j_r})\}$ is a transcendence
set in $A(V(\mathfrak b))$. Thus

$$\dim V(\mathfrak b) = \operatorname{tr.deg} A(V(\mathfrak b)) \ge r = \operatorname{tr.deg} A(C(V(\mathfrak a))) - 1 = \dim C(V(\mathfrak a)) - 1.$$

By Corollary 10.19, $\dim V(\mathfrak b) = \dim V(\mathfrak a)$. Hence $\dim C(V(\mathfrak a)) \le \dim V(\mathfrak a) + 1$,
and the proof is complete.


Corollary 10.82. If $\mathfrak a$ is a homogeneous ideal in $k[X_0, \dots, X_n]$ and if the
corresponding projective algebraic set $V(\mathfrak a)$ is nonempty, then $\dim V(\mathfrak a)$ equals
the degree of the Hilbert polynomial $H(s, \mathfrak a)$.


PROOF. This is immediate from Proposition 10.81 because $\dim C(V(\mathfrak a)) =
\deg H_a(s, \mathfrak a)$ and because $\deg H(s, \mathfrak a) = \deg H_a(s, \mathfrak a) - 1$.


We could also obtain a corollary relating H(s, V(a)) and H(s, V(LT(a))) when 
a graded monomial ordering is imposed, and we could then give a geometric way 
of visualizing the dimension in terms of the projective case. But we shall not 
need these details, and we omit them. 


12. Intersections in Projective Space 


Hilbert polynomials are an appropriate tool for dealing with how a projective 
algebraic set intersects a lower-dimensional projective space. In this section we 
consider such intersections, and we obtain as a corollary the deep result that a 
system of homogeneous polynomial equations over an algebraically closed field 
k always has a nonzero solution if there are more variables than equations. 

It will be convenient in this section to adopt the convention that the empty
projective algebraic set has dimension $-1$ and that the 0 Hilbert polynomial has
degree $-1$. To make use of this convention, we recall from the homogeneous
Nullstellensatz (Proposition 10.12a) that a homogeneous ideal $\mathfrak{a}$ in $k[X_0, \dots, X_n]$
has $V(\mathfrak{a})$ empty in $\mathbb{P}^n$ if and only if there is an integer $N$ such that $\mathfrak{a}$ contains
$k[X_0, \dots, X_n]_k$ for $k \ge N$. In this case our definition makes $C(V(\mathfrak{a}))$ consist
of $\{0\}$ alone.²⁵ With the convention that such ideals have $\dim V(\mathfrak{a}) = -1$ and
$C(V(\mathfrak{a})) = \{0\}$, the formula of Proposition 10.81 remains valid, and we can
therefore drop the assumption that $V(\mathfrak{a})$ is nonempty. As to Corollary 10.82,
the definition of the Hilbert function when $\mathfrak{a}$ contains $k[X_0, \dots, X_n]_k$ for all
sufficiently large $k$ makes $\mathcal{H}(k, \mathfrak{a}) = 0$ for such $k$; therefore the Hilbert polynomial
in this case is the 0 polynomial, and Corollary 10.82 continues to be valid even
when $V(\mathfrak{a})$ is empty.


Theorem 10.83. If $\mathfrak{a}$ is any homogeneous ideal in $k[X_0, \dots, X_n]$ and if $F$ is
a homogeneous polynomial, then

$$\dim V(\mathfrak{a}) \ge \dim V(\mathfrak{a} + (F)) \ge \dim V(\mathfrak{a}) - 1.$$

In particular, $V(\mathfrak{a} + (F))$ is nonempty if $\dim V(\mathfrak{a}) \ge 1$.


PROOF. Since $\mathfrak{a} \subseteq \mathfrak{a} + (F)$ and since $V(\,\cdot\,)$ is inclusion reversing, we know
that
$$\dim V(\mathfrak{a}) \ge \dim V(\mathfrak{a} + (F)).$$
To obtain the second inequality of the theorem, we shall compare the Hilbert
polynomials $H(s, \mathfrak{a})$ and $H(s, \mathfrak{a} + (F))$, taking advantage of Corollary 10.82.
Let $d = \deg F$, and suppose that $s \ge d$. The identity mapping on $k[X_0, \dots, X_n]_s$
descends to a $k$ linear mapping

$$\varphi : k[X_0, \dots, X_n]_s / \mathfrak{a}_s \to k[X_0, \dots, X_n]_s / (\mathfrak{a} + (F))_s,$$

and $\varphi$ is onto, being formed from an onto map. To understand $\ker \varphi$, we shall use
the $k$ linear map

$$\psi : k[X_0, \dots, X_n]_{s-d} / \mathfrak{a}_{s-d} \to k[X_0, \dots, X_n]_s / \mathfrak{a}_s$$

induced by multiplication by $F$, which we view as carrying $k[X_0, \dots, X_n]_{s-d}$
into $k[X_0, \dots, X_n]_s / \mathfrak{a}_s$. Observe that if $G$ is in $k[X_0, \dots, X_n]_{s-d}$, then $FG$ is
in $(\mathfrak{a} + (F))_s$, and therefore $\varphi \circ \psi = 0$, i.e., $\operatorname{image} \psi \subseteq \ker \varphi$.

We shall prove that equality holds. Thus suppose that $G$ is a member of
$k[X_0, \dots, X_n]_s$ such that $G + \mathfrak{a}_s$ is in $\ker \varphi$, i.e., that $G$ is in $(\mathfrak{a} + (F))_s$. Then
we can write $G = G_1 + HF$ with $G_1$ in $\mathfrak{a}_s$ and $H$ in $k[X_0, \dots, X_n]_{s-d}$. So
$G - G_1 = HF$, and the coset $G + \mathfrak{a}_s = G - G_1 + \mathfrak{a}_s$ is $\psi$ of $H + \mathfrak{a}_{s-d}$. We
conclude that $\operatorname{image} \psi = \ker \varphi$.

Now we compute 


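$$\begin{aligned}
\dim_k k[X_0, \dots, X_n]_s / \mathfrak{a}_s
&= \dim_k(\operatorname{domain} \varphi) = \dim_k(\ker \varphi) + \dim_k(\operatorname{image} \varphi) \\
&= \dim_k(\operatorname{image} \psi) + \dim_k k[X_0, \dots, X_n]_s / (\mathfrak{a} + (F))_s \\
&\le \dim_k k[X_0, \dots, X_n]_{s-d} / \mathfrak{a}_{s-d} + \dim_k k[X_0, \dots, X_n]_s / (\mathfrak{a} + (F))_s.
\end{aligned}$$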


25 Admittedly the inclusion of $\{0\}$ in the cone might seem unnatural if $\mathfrak{a} = k[X_0, \dots, X_n]$, but
that is the definition that makes this particular $\mathfrak{a}$ behave like all other ideals.




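In terms of Hilbert functions, this says that

$$\mathcal{H}(s, \mathfrak{a}) \le \mathcal{H}(s - d, \mathfrak{a}) + \mathcal{H}(s, \mathfrak{a} + (F)).$$

For large $s$, this is an inequality of polynomials:

$$H(s, \mathfrak{a}) \le H(s - d, \mathfrak{a}) + H(s, \mathfrak{a} + (F)).$$

Since $H(s, \mathfrak{a}) - H(s - d, \mathfrak{a})$ is a polynomial of one lower degree than $H(s, \mathfrak{a})$
with leading coefficient positive, we obtain

$$\deg H(s, \mathfrak{a}) - 1 \le \deg H(s, \mathfrak{a} + (F)).$$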


The second inequality of the theorem now follows from Corollary 10.82. The 
final assertion in the theorem takes into account the remarks in the paragraph 
preceding the statement of the theorem. 
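
As a sanity check on the dimension count in the proof, the following sketch treats only the simplest case $\mathfrak{a} = 0$. It is an illustration under stated assumptions, not part of the argument: it assumes the standard monomial count $\dim_k k[X_0, \dots, X_n]_s = \binom{s+n}{n}$, and it uses the fact that multiplication by a nonzero $F$ is one-one on the polynomial ring, so that $\dim_k (F)_s = \binom{s-d+n}{n}$ for $s \ge d$. The values $n = 3$, $d = 2$ and all function names are chosen here for illustration.

```python
from math import comb


def hilbert_zero_ideal(s, n):
    """dim of k[X_0..X_n]_s: the number of monomials of degree s in n+1 variables."""
    return comb(s + n, n) if s >= 0 else 0


def hilbert_hypersurface(s, n, d):
    """dim of k[X_0..X_n]_s / (F)_s for a nonzero homogeneous F of degree d.

    Assumption: multiplication by F is one-one, so dim (F)_s = dim k[X]_(s-d).
    """
    return hilbert_zero_ideal(s, n) - hilbert_zero_ideal(s - d, n)


def polynomial_degree(values):
    """Degree of a polynomial sampled at consecutive integers, by finite differences.

    Returns -1 for the zero polynomial, matching the convention of this section.
    The sample must contain more points than the degree plus one.
    """
    row = list(values)
    degree, order = -1, 0
    while row and any(row):
        degree = order
        row = [b - a for a, b in zip(row, row[1:])]
        order += 1
    return degree


n, d = 3, 2                      # illustrative choices: P^3 and a quadric
sample = range(d, d + 10)        # only s >= d, as in the proof
values_a = [hilbert_zero_ideal(s, n) for s in sample]
values_f = [hilbert_hypersurface(s, n, d) for s in sample]

# The inequality H(s, a) <= H(s - d, a) + H(s, a + (F)) from the proof.
inequality_holds = all(
    hilbert_zero_ideal(s, n)
    <= hilbert_zero_ideal(s - d, n) + hilbert_hypersurface(s, n, d)
    for s in sample
)
print(polynomial_degree(values_a), polynomial_degree(values_f), inequality_holds)
```

In this case the inequality is an equality, and the two degrees differ by exactly 1, which is the extreme case allowed by the theorem.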


Corollary 10.84. If $\mathfrak{a}$ is any homogeneous ideal in $k[X_0, \dots, X_n]$ and if
$F_1, \dots, F_r$ are homogeneous polynomials, then

$$\dim V(\mathfrak{a}) \ge \dim V(\mathfrak{a} + (F_1, \dots, F_r)) \ge \dim V(\mathfrak{a}) - r.$$

In particular, $V(\mathfrak{a} + (F_1, \dots, F_r))$ is nonempty if $\dim V(\mathfrak{a}) \ge r$.


PROOF. We use Theorem 10.83 inductively, first applying it to the ideal $\mathfrak{a}$ with
$F = F_1$, then applying it to the ideal $\mathfrak{a} + (F_1)$ with $F = F_2$, and so on. This
proves the first conclusion, and the second conclusion follows because of the
convention that the empty set has dimension $-1$.


Corollary 10.85. Over an algebraically closed field any system of homoge- 
neous polynomial equations with more variables than equations has a nonzero 
solution. 


PROOF. Let there be $r$ equations and $n + 1$ variables with $n + 1 > r$, the
equations being $F_1 = 0, \dots, F_r = 0$. The zero locus for each equation is a subset
of $\mathbb{P}^n$. Applying Corollary 10.84 with $\mathfrak{a} = 0$ shows that $\dim V(F_1, \dots, F_r) \ge
n - r \ge 0$ and that $V(F_1, \dots, F_r)$ is not empty as long as $n \ge r$.


Corollary 10.85 is the result in the present chapter that was anticipated in 
Problem 23 at the end of Chapter VIII. 
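
The hypothesis that the field is algebraically closed cannot simply be dropped. The sketch below is an illustration and is not taken from the text: the form $X^2 + Y^2$ (one equation, two variables) and the field $\mathbb{F}_3$ are choices made here, and the function names are hypothetical. A brute-force search finds no nonzero zero over $\mathbb{F}_3$, while over $\mathbb{C}$ the point $(1, i)$ is a nonzero zero.

```python
from itertools import product


def form(x, y):
    """The homogeneous polynomial F(X, Y) = X^2 + Y^2 (illustrative choice)."""
    return x * x + y * y


def nonzero_zeros_mod_p(p):
    """All nonzero pairs (x, y) in F_p x F_p with F(x, y) = 0 in F_p."""
    return [
        (x, y)
        for x, y in product(range(p), repeat=2)
        if (x, y) != (0, 0) and form(x, y) % p == 0
    ]


zeros_mod_3 = nonzero_zeros_mod_p(3)   # F_3 is not algebraically closed
complex_value = form(1, 1j)            # (1, i) is a nonzero point of C^2
print(zeros_mod_3, complex_value)
```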




13. Schemes 


We conclude with some commentary about “schemes.” The subject of algebraic 
geometry studied along the lines of Sections 1-12 suffers from at least two 
shortcomings. One concerns the coefficients that are involved. The original 
impetus for the subject came from systems of polynomial equations in several 
variables. These equations involve addition, subtraction, and multiplication, and 
the requirement that division be allowable is unnatural and cuts down the scope 
of the subject. It immediately cuts out Diophantine equations, for example, to 
say nothing of congruences modulo prime powers. It would be more natural to 
allow the coefficients to lie in any commutative ring with identity. The other 
shortcoming is that the definition of variety depends on an embedding whose 
chief role is to get past the stage of making definitions; soon the embedding is 
stripped away, and the interest is in varieties up to isomorphism. The situation 
is similar to the historical treatment of groups and of manifolds. Groups were 
for the most part originally conceived in terms of group actions, but eventually 
the groups were separated from the actions. Manifolds at first were defined as 
certain subsets of Euclidean space, but eventually they were given an intrinsic 
definition. It would be more in keeping with the wisdom gained from other areas 
of mathematics if varieties could be defined intrinsically right away. 

Schemes, introduced and developed by A. Grothendieck in the late 1950s and 
early 1960s, accomplish both these objectives. The theory of schemes borrows 
ideas and techniques from many areas of mathematics, as will be apparent shortly. 
This section will briefly present some of the definitions, offer some examples, and 
show the sense in which varieties may be regarded as schemes.²⁶ The interested
reader may want to read more, and this section will therefore conclude with some 
bibliographical remarks. 


1. Spectrum. One preliminary remark is necessary. To isolate an affine
variety from its ambient space $\mathbb{A}^n$, we can take advantage of Proposition 10.23,
which says that the points of the variety correspond exactly to the maximal ideals
of the affine coordinate ring.²⁷ The set of maximal ideals in a ring, however,
is usually not an object that lends itself to use with mappings. For example the 
canonical inclusion of Z into Q is not reflected in any of the mappings of the 
singleton set {(0)} of maximal ideals of Q into the set of maximal ideals of Z. 
Instead, the theory of schemes works with prime ideals. These behave nicely 
in that the inverse image of a prime ideal under a homomorphism of rings with 
identity is a prime ideal. 


26 The material in this section is based in part on lectures by V. Schechtman given in 1991-92
and in part on the books by Gunning, Hartshorne, and Shafarevich in the Selected References. 

27 Readers familiar with some functional analysis will recognize that a similar thing happens with 
compact Hausdorff spaces; by a theorem of M. Stone, the points of the space correspond exactly to 
the maximal ideals of the algebra of continuous complex-valued functions on the space. 




Thus we work with the category of commutative rings with identity, the mo- 
tivating example being the affine coordinate ring of an affine variety over an 
algebraically closed field. If A is a ring in this category, the spectrum of A is the 
set Spec A of prime ideals of A. For example the spectrum of a field consists of 
the one element (0), that of a discrete valuation ring consists of 0 and the unique 
maximal ideal, that of a principal ideal domain consists of 0 and the principal 
ideals $(f)$ such that $f$ is an irreducible element, and that of $\mathbb{C}[X, Y]$ consists of
the ideal $(0)$, the maximal ideals corresponding to one-point sets in $\mathbb{C}^2$, and all
prime ideals $(f(X, Y))$ of irreducible affine plane curves over $\mathbb{C}$.

The spectrum of $A$ is understood to carry along with it two additional pieces
of structure. The first piece of structure is an analog for $\operatorname{Spec} A$ of the Zariski
topology.²⁸ To each ideal $\mathfrak{a}$ of $A$, we associate the subset $V(\mathfrak{a}) \subseteq \operatorname{Spec} A$ of all
prime ideals $\mathfrak{p}$ with $\mathfrak{a} \subseteq \mathfrak{p}$. The sets $V(\mathfrak{a})$ are easily seen to have the defining
properties of the closed sets of a topology, and this topology will always be
understood to be in place. It is immediate from the definition that $V(\mathfrak{a}) = V(\sqrt{\mathfrak{a}})$
for every ideal $\mathfrak{a}$. One checks for any prime ideal $\mathfrak{p}$ that $V(\mathfrak{p}) = \overline{\{\mathfrak{p}\}}$, the closure of
$\{\mathfrak{p}\}$; consequently the one-point set $\{\mathfrak{p}\}$ is closed if and only if $\mathfrak{p}$ is a maximal ideal.
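
For a concrete feel for these closed sets, here is a minimal sketch for the ring $A = \mathbb{Z}$, which is an illustrative choice and not an example worked in the text. It assumes the standard facts that every ideal of $\mathbb{Z}$ is principal and that, for $m \ne 0$, the prime ideals containing $(m)$ are the ideals $(p)$ with $p$ a prime divisor of $m$. Each prime ideal $(p)$ is represented by its positive generator, and the function names are hypothetical.

```python
from math import gcd


def prime_divisors(m):
    """Set of primes dividing the nonzero integer m, by trial division."""
    m = abs(m)
    primes = set()
    p = 2
    while p * p <= m:
        while m % p == 0:
            primes.add(p)
            m //= p
        p += 1
    if m > 1:
        primes.add(m)
    return primes


def V(m):
    """Closed set V((m)) in Spec Z for m != 0, as the set of generators p.

    The generic point (0) never belongs to V((m)) when m != 0, because
    (m) is not contained in (0). The case m = 0 gives all of Spec Z and
    is excluded here since that set is infinite.
    """
    if m == 0:
        raise ValueError("V((0)) is all of Spec Z")
    return prime_divisors(m)


# Unions of closed sets correspond to products of ideals,
# intersections to sums of ideals, and (a) + (b) = (gcd(a, b)) in Z.
union_ok = V(12) | V(35) == V(12 * 35)
intersection_ok = V(12) & V(18) == V(gcd(12, 18))
radical_ok = V(12) == V(6)          # the radical of (12) is (6)
print(V(12), union_ok, intersection_ok, radical_ok)
```

Every set produced by `V` is finite, in line with the description of the proper closed sets in a principal ideal domain.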

At least when $A$ is Noetherian, $\operatorname{Spec} A$ is a Noetherian space, and a notion
of dimension (not necessarily finite) is defined for each closed set in the usual
way²⁹ as in Section 2; for $A$ itself this coincides with the Krull dimension of
$A$. In this situation the irreducible closed sets are the sets $V(\mathfrak{p})$ with $\mathfrak{p}$ prime.
The fact that such a set is irreducible follows from the identity $V(\mathfrak{p}) = \overline{\{\mathfrak{p}\}}$; the
converse assertion follows from the identity $V(\mathfrak{a}) = V(\sqrt{\mathfrak{a}})$ and the Lasker-Noether
Decomposition Theorem (Problem 14 at the end of Chapter VII). By
Proposition 10.5 every closed set is a finite union of irreducible closed sets, and
thus we have a complete description of the closed sets. For example, in a principal
ideal domain the closed sets consist of the finite sets of nonzero prime ideals, as
well as the set of all prime ideals. For the ring $A = \mathbb{C}[X, Y]$, every proper closed
set of $\operatorname{Spec} A$ is a finite union of singleton sets $\{(X - x_0, Y - y_0)\}$ and of sets

$$\{(f(X, Y))\} \cup \bigcup_{f(x_0, y_0) = 0} \{(X - x_0, Y - y_0)\}$$

with $f(X, Y)$ irreducible.

If $\varphi : A \to B$ is a homomorphism in our category of rings (always assumed
to carry 1 to 1) and if $\mathfrak{p}$ is a prime ideal in $B$, then $\varphi^{-1}(\mathfrak{p})$ is a prime ideal in $A$.
Thus the definition ${}^a\varphi(\mathfrak{p}) = \varphi^{-1}(\mathfrak{p})$ gives us a function ${}^a\varphi : \operatorname{Spec} B \to \operatorname{Spec} A$.
If $E$ is a subset of $A$, then we readily check that

$$({}^a\varphi)^{-1}(V(E)) = ({}^a\varphi)^{-1}(\{\mathfrak{p} \mid \mathfrak{p} \supseteq E\}) = \{\mathfrak{q} \mid {}^a\varphi(\mathfrak{q}) \supseteq E\} = V(\varphi(E)),$$


28 A little care is needed with the definitions when $A$ is the 0 ring, which has an identity but no
prime ideals. Then Spec A is empty, but we will want to allow it as part of the theory. So we need 
to allow the empty set as a topological space. 

29 The general theory treats dimension as defined even when $A$ is not Noetherian, but it will be
enough in this section to consider only the Noetherian case. 




from which it follows that ${}^a\varphi$ is continuous. The function ${}^a\varphi$ can be fairly subtle.
For example, if $\varphi$ is the inclusion of $\mathbb{Z}$ into the ring $R$ of algebraic integers in
a number field and if $P$ is a nonzero prime ideal in $R$, then ${}^a\varphi(P) = P \cap \mathbb{Z}$ is
the corresponding prime ideal $(p)$ in $\mathbb{Z}$; the continuity of ${}^a\varphi$ implies that each
nonzero prime ideal $(p)$ of $\mathbb{Z}$ arises in this way from only finitely many ideals $P$
in $R$.
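
As a hedged illustration of this remark, take the number field $\mathbb{Q}(i)$, a choice made here and not taken from the text. Its ring of algebraic integers is $R = \mathbb{Z}[i]$, and the standard factorization of rational primes in $\mathbb{Z}[i]$ gives the following fibers of ${}^a\varphi$ over a few points of $\operatorname{Spec} \mathbb{Z}$.

```latex
\begin{align*}
({}^a\varphi)^{-1}\{(2)\} &= \{(1+i)\},         & 2 &= -i\,(1+i)^2, \\
({}^a\varphi)^{-1}\{(3)\} &= \{(3)\},           & 3 &\text{ remains prime in } \mathbb{Z}[i], \\
({}^a\varphi)^{-1}\{(5)\} &= \{(2+i),\ (2-i)\}, & 5 &= (2+i)(2-i), \\
({}^a\varphi)^{-1}\{(0)\} &= \{(0)\}.
\end{align*}
```

Each fiber over a nonzero prime is finite, as the text asserts, but the number of points in the fiber varies with the prime.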


2. Structure sheaf. The second piece of additional structure carried by the 
spectrum of A is its “structure sheaf,” which is a certain specific sheaf with
base space Spec A. Sheaves were introduced by J. Leray in 1946 in connection 
with partial differential equations and by K. Oka and H. Cartan about 1950 in 
connection with the theory of several complex variables. As with vector bundles, 
sheaves may be viewed as having a base space carrying some topological infor- 
mation and fibers carrying some algebraic information; local sections will be of 
great interest. The initial example of a sheaf in several complex variables is the 
“sheaf of germs of holomorphic functions” on an open set in $\mathbb{C}^n$, germs being
defined for holomorphic functions on an open set in the same way as they were 
defined in Section 4 for rational functions on a quasi-affine variety. 

We shall define two general notions, “sheaf” and “presheaf,” and compare
them. The prototype of a presheaf in several complex variables is the collection 
of vector spaces of holomorphic functions on each nonempty open subset of the 
given open set; the prototype in classical algebraic geometry is the collection of 
regular functions on each nonempty open subset of a quasiprojective variety. In 
the general case, fix a category to describe the allowable structure on each fiber; 
common choices for the objects in this category are abelian groups, commutative 
rings with identity (called “rings” hereafter in this section), and unital R modules 
for some ring. In defining sheaves and presheaves, we shall write the definitions 
using abelian groups, since it is a simple matter to adjoin the additional structure 
when the fibers are rings or modules. 

Let $X$ be a topological space. A presheaf of abelian groups on the base space
$X$ is a collection $\{\mathcal{O}(U), \rho_{VU}\}$, parametrized by the open subsets $U$ of $X$ and the
open subsets $V$ of $U$, such that each $\mathcal{O}(U)$ is an abelian group, $\mathcal{O}(\varnothing)$ is the 0
group, each $\rho_{VU} : \mathcal{O}(U) \to \mathcal{O}(V)$ is a group homomorphism, each $\rho_{UU}$ is the
identity, and $\rho_{WV}\rho_{VU} = \rho_{WU}$ whenever $W \subseteq V \subseteq U$. We are to think of $\mathcal{O}(U)$
as a space of sections of some kind over $U$ and $\rho_{VU}$ as a restriction map carrying
sections over $U$ to sections over $V$. A sheaf of abelian groups on the base space
$X$ is a topological space $\mathcal{O}$ with a mapping $\pi : \mathcal{O} \to X$ such that $\pi$ is a local
homeomorphism onto, $\pi^{-1}(P)$ is an abelian group for each $P \in X$, and the group
operations on each $\pi^{-1}(P)$ are continuous in the relative topology from $\mathcal{O}$. We
are to think of the elements of a sheaf as germs obtained starting from a presheaf.
The individual fibers $\pi^{-1}(P)$ of a sheaf are called stalks. One writes $(X, \mathcal{O})$ for
the sheaf, sometimes abbreviating the notation to $\mathcal{O}$.

It is possible to construct a presheaf from a sheaf, and vice versa. If we are 




given a sheaf $\mathcal{O}$, we define a section $s$ of $\mathcal{O}$ over $U$ to be a continuous function
$s : U \to \mathcal{O}$ such that $\pi \circ s = 1_U$. If $\mathcal{O}(U)$ denotes the abelian group of sections
of $\mathcal{O}$ over $U$ and if $\rho_{VU}$ is the restriction map for sections, then $\{\mathcal{O}(U), \rho_{VU}\}$ is
a presheaf. In the reverse direction if we start from a presheaf $\{\mathcal{O}(U), \rho_{VU}\}$ and
form the kind of direct limit of abelian groups at each point that is suggested by the
passage to germs, then it is possible to topologize the disjoint union of the abelian
groups of germs so as to produce a sheaf. Passing from a sheaf to a presheaf and
then back to a sheaf reproduces the original sheaf. But passing from a presheaf
to a sheaf and then back to a presheaf does not necessarily reproduce the original
presheaf. A necessary and sufficient condition on the presheaf $\{\mathcal{O}(U), \rho_{VU}\}$ for
$\{\mathcal{O}(U), \rho_{VU}\}$ to result from passing to a sheaf and then back to a presheaf is that
the presheaf be complete in the sense that both the following conditions hold:


(i) Whenever $\{U_j\}$ is an open covering of an open subset $U$ of $X$ and
$f \in \mathcal{O}(U)$ is an element such that $\rho_{U_j, U}(f) = 0$ for all $j$, then $f = 0$.

(ii) Whenever $\{U_j\}$ is an open covering of an open subset $U$ of $X$ and
$f_j$ is given in $\mathcal{O}(U_j)$ for each $j$ in such a way that $\rho_{U_j \cap U_k, U_j}(f_j) =
\rho_{U_j \cap U_k, U_k}(f_k)$ for all $j$ and $k$, then there exists $f \in \mathcal{O}(U)$ such that
$\rho_{U_j, U}(f) = f_j$ for all $j$.
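
Conditions (i) and (ii) can be made concrete with a toy model. The sketch below is only an illustration under assumptions made here: it models the presheaf of all integer-valued functions, represents a section over an open set as a Python dictionary from points to values, and takes the restriction maps to be ordinary restriction of functions. The names `restrict` and `glue` are hypothetical.

```python
def restrict(section, subset):
    """Restriction map: cut a function down to the points of a smaller set."""
    return {point: section[point] for point in subset}


def glue(local_sections):
    """Condition (ii): glue sections f_j on sets U_j that agree on overlaps.

    Each section is a dict whose key set plays the role of U_j. Returns the
    glued section on the union of the U_j, or None if two of the given
    sections disagree at some point of an overlap.
    """
    glued = {}
    for section in local_sections:
        for point, value in section.items():
            if point in glued and glued[point] != value:
                return None          # compatibility on an overlap fails
            glued[point] = value
    return glued


f1 = {"a": 1, "b": 2}                # section over U_1 = {a, b}
f2 = {"b": 2, "c": 5}                # section over U_2 = {b, c}, agrees at b
f = glue([f1, f2])                   # section over U = U_1 ∪ U_2

incompatible = glue([f1, {"b": 3, "c": 5}])   # disagreement at b
print(f, incompatible)
```

Condition (i) holds in this model because a function that vanishes on every member of a covering vanishes everywhere; the glued section is therefore unique.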


The structure sheaf of the spectrum of $A$ is a certain sheaf of rings $(\operatorname{Spec} A, \mathcal{O})$
with base space $\operatorname{Spec} A$. Just as in the case of regular (= polynomial) functions on
an affine variety, this sheaf will have the property that the ring of global sections
is isomorphic to the original ring (cf. Corollary 10.25). We shall describe $\mathcal{O}$ by
describing the presheaf. For each prime ideal $\mathfrak{p}$ of $A$, let $A_{\mathfrak{p}}$ be the localization of
$A$ at $\mathfrak{p}$, i.e., the localization of $A$ relative to the multiplicative system consisting
of the set-theoretic complement of $\mathfrak{p}$. This kind of localization is always a local
ring. The idea is to define a ring $\mathcal{O}(U)$ of regular functions for each open subset
$U$ of $\operatorname{Spec} A$ in such a way that the stalk $\mathcal{O}_{\mathfrak{p}}$ at the point $\mathfrak{p}$ ends up being $A_{\mathfrak{p}}$ for
each $\mathfrak{p}$. With affine varieties we were able to make the definition directly in terms
of the function field of the variety, i.e., the field of fractions of $A$; both $\mathcal{O}(U)$
and the stalk $\mathcal{O}_P(U)$ at each point $P$ ended up being subrings of this function
field. The complication for general $A$ is that we do not have a convenient analog
of the function field available in which all the localizations are subrings. Thus
we proceed by imitating the messier equivalent definition of regular function
given in Proposition 10.28. Namely, for $U$ open in $\operatorname{Spec} A$, let $\mathcal{O}(U)$ be the set of
functions $s$ from $U$ into the product $\prod_{\mathfrak{p} \in U} A_{\mathfrak{p}}$ such that $s(\mathfrak{p})$ is in the $\mathfrak{p}$th factor $A_{\mathfrak{p}}$
for each $\mathfrak{p}$ and such that $s$ is locally a quotient of members of $A$ in the following
sense: for each $\mathfrak{p}$ in $U$, there is to be an open neighborhood $V$ of $\mathfrak{p}$ within $U$ and
there are to be elements $a$ and $f$ in $A$ such that for each $\mathfrak{q}$ in $V$, the element $f$ is
not in $\mathfrak{q}$ and $s(\mathfrak{q})$ equals $a/f$ in $A_{\mathfrak{q}}$. (Recall that any element of $A$ not in $\mathfrak{q}$ defines
an element in the multiplicative system leading to $A_{\mathfrak{q}}$; $f$ is to be such an element
for each $\mathfrak{q}$ in $V$.) The mappings $\rho_{VU}$ are taken as ordinary restriction mappings,
and the result is a presheaf. This presheaf is complete, and the associated sheaf




is the structure sheaf $(\operatorname{Spec} A, \mathcal{O})$. An affine scheme is any sheaf of rings that is
isomorphic in a suitable sense to the structure sheaf of some ring.
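
To see the definition of the structure sheaf at work in the smallest nontrivial case, here is a worked example that is not part of the text. It assumes that $A$ is a discrete valuation ring with maximal ideal $\mathfrak{m}$ and field of fractions $K$, so that $\operatorname{Spec} A = \{(0), \mathfrak{m}\}$ as in the examples of spectra given earlier in this section.

```latex
% Closed sets: V((0)) = Spec A,  V(m^k) = {m} for k >= 1,  V(A) = empty set.
% Hence the open sets are exactly: the empty set, {(0)}, and Spec A.
\begin{align*}
\mathcal{O}(\varnothing)           &= 0, \\
\mathcal{O}(\{(0)\})               &= A_{(0)} = K, \\
\mathcal{O}(\operatorname{Spec} A) &= A, \\
\mathcal{O}_{(0)}                  &= A_{(0)} = K,
\qquad
\mathcal{O}_{\mathfrak{m}} = A_{\mathfrak{m}} = A.
\end{align*}
```

The only open neighborhood of $\mathfrak{m}$ is $\operatorname{Spec} A$ itself, which is why a global section is a single quotient $a/f$ with $f \notin \mathfrak{m}$, i.e., an element of $A$. The restriction map $\mathcal{O}(\operatorname{Spec} A) \to \mathcal{O}(\{(0)\})$ is the inclusion of $A$ into $K$.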


3. Scheme. To define “scheme” and the notion that a scheme is defined over
some ring or some field, we need to back up and say a few more words about
mappings in connection with sheaves. A ringed space is a sheaf of rings, $(\operatorname{Spec} A, \mathcal{O})$
being an example. Let $(X, \mathcal{O}_X)$ and $(Y, \mathcal{O}_Y)$ be two ringed spaces, and let
$\{\rho^X_{VU}\}$ and $\{\rho^Y_{VU}\}$ be their respective systems of restriction maps. A morphism
$(\sigma, \psi) : (X, \mathcal{O}_X) \to (Y, \mathcal{O}_Y)$ of ringed spaces consists of a continuous function
$\sigma : X \to Y$ and a collection $\psi$ of homomorphisms $\psi_U : \mathcal{O}_Y(U) \to \mathcal{O}_X(\sigma^{-1}(U))$
such that

$$\psi_V \circ \rho^Y_{VU} = \rho^X_{\sigma^{-1}(V),\, \sigma^{-1}(U)} \circ \psi_U$$

whenever $U$ and $V$ are open subsets of $Y$ with $V \subseteq U$. The collection $\psi = \{\psi_U\}$
yields homomorphisms of stalks $\psi_P : \mathcal{O}_{Y, \sigma(P)} \to \mathcal{O}_{X, P}$ for each $P$ in $X$.

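One property of the definition is that if $\varphi : A \to B$ is a homomorphism of rings,
then there is an associated morphism $(\sigma, \psi) : (\operatorname{Spec} B, \mathcal{O}_B) \to (\operatorname{Spec} A, \mathcal{O}_A)$ of
ringed spaces. The continuous map $\sigma : \operatorname{Spec} B \to \operatorname{Spec} A$ is the map $\sigma = {}^a\varphi$
given by ${}^a\varphi(\mathfrak{p}) = \varphi^{-1}(\mathfrak{p})$ for any prime ideal $\mathfrak{p}$ of $B$. The mapping $\psi$ on
stalks carries $\mathcal{O}_{\operatorname{Spec} A, \sigma(\mathfrak{p})} = \mathcal{O}_{\operatorname{Spec} A, \varphi^{-1}(\mathfrak{p})}$ to $\mathcal{O}_{\operatorname{Spec} B, \mathfrak{p}}$ and is what is induced
on the stalk by composition with $\varphi$. It is not quite true that every morphism
$(\sigma, \psi) : (\operatorname{Spec} B, \mathcal{O}_B) \to (\operatorname{Spec} A, \mathcal{O}_A)$ of ringed spaces arises from a ring
homomorphism. The morphism $(\sigma, \psi)$ of ringed spaces resulting from the
ring homomorphism $\varphi$ has the property that $\psi$ carries the maximal ideal $M_{\varphi^{-1}(\mathfrak{p})}$
of the stalk $A_{\varphi^{-1}(\mathfrak{p})}$ into the maximal ideal $M_{\mathfrak{p}}$ of the stalk $B_{\mathfrak{p}}$. A morphism
$(\sigma, \psi)$ of ringed spaces whose stalks are local rings is called a local morphism if
it has this property. With this definition one can show that every local morphism
of ringed spaces $(\sigma, \psi) : (\operatorname{Spec} B, \mathcal{O}_B) \to (\operatorname{Spec} A, \mathcal{O}_A)$ arises from some ring
homomorphism $\varphi : A \to B$. This result is to be compared with Corollary 10.40
for affine varieties.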

An isomorphism of ringed spaces is automatically local if all the stalks are 
local rings. The reason is that an isomorphism of one local ring onto another 
carries the maximal ideal of the first onto the maximal ideal of the second. Thus 
the earlier definition of affine scheme as a ringed space that is isomorphic to 
some $(\operatorname{Spec} A, \mathcal{O})$ concealed only the rather natural definition of isomorphism of
ringed spaces, not the more subtle condition “local.” 

A morphism of affine schemes is a local morphism of the affine schemes as 
ringed spaces. Then the classes of all affine schemes and morphisms of affine 
schemes together form a category. A scheme is a ringed space (X, O) such that 
each point of X has an open neighborhood for which the restriction of the ringed 
space to that part of the base is isomorphic to an affine scheme. One can define 
a natural notion of morphism for schemes, and the classes of all schemes and 
morphisms of schemes together form a category. 




4. Variety as a scheme. Let V be an affine variety over an algebraically 
closed field, and let A(V) be the affine coordinate ring. We have just seen 
how Spec A(V) has the natural structure of an affine scheme. Since Spec A(V) 
includes all prime ideals of A(V), not just the maximal ideals, the continuous 
inclusion $V \to \operatorname{Spec} A(V)$ is not onto. However, there is a natural relationship
between the two, and there is a natural relationship between their rings of regular 
functions. The reason is that morphisms of affine varieties correspond exactly (in 
contravariant fashion) to homomorphisms of the affine coordinate rings, which in 
turn correspond exactly to morphisms of affine schemes. From the point of view of 
categories, therefore, the categories of affine varieties and affine schemes match 
perfectly. This description blurs what happens to the underlying algebraically 
closed field of scalars, and one wants to be able to say that the categories of affine 
varieties over k and affine schemes over k match perfectly. Making this statement 
requires an additional construction, which will be sketched in the next subsection. 

This correspondence can be extended suitably from affine varieties to quasipro- 
jective varieties, and the interested reader can find details on page 30 of Volume 2 
of Shafarevich’s books. 


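5. Scheme defined over a ring. If $A$ is a ring and $(X, \mathcal{O}_X)$ is a scheme,
then a morphism of schemes $(\sigma, \psi) : X \to \operatorname{Spec} A$ defines a homomorphism
$A \to \mathcal{O}_X(U)$ of rings for each open subset $U$ of $X$. Specifically $\psi_{\operatorname{Spec} A}$ carries
$\mathcal{O}_{\operatorname{Spec} A}(\operatorname{Spec} A) = A$ into $\mathcal{O}_X(X)$, and hence $\rho_{UX} \circ \psi_{\operatorname{Spec} A}$ carries $A$ into $\mathcal{O}_X(U)$
if $\{\rho_{VU}\}$ is the system of restriction maps for $(X, \mathcal{O}_X)$. The result is that $\mathcal{O}_X$
becomes a sheaf of $A$ algebras.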

Conversely if Ox is a sheaf of A algebras, then one can construct a morphism 
of schemes $X \to \operatorname{Spec} A$. In this case one says that $(X, \mathcal{O}_X)$ is a scheme over
A. Every sheaf of abelian groups is a sheaf of Z algebras, and thus every scheme 
is a scheme over Z. Schemes over Z are of special interest in number-theoretic 
situations, among others. The schemes produced from varieties in the previous 
subsection are schemes over the underlying field k. The notion of a scheme over a 
field that is not algebraically closed is one way of extending the theory of varieties 
to have it apply when the underlying field is not algebraically closed. 


6. Role of homological algebra. The sheaves of abelian groups over a fixed 
topological space X, with a natural definition of morphism, form a category, and 
one can define kernels and cokernels in this category. The result turns out to 
be an abelian category with enough injectives, and the homological algebra of 
Chapter IV is applicable. If $(X, \mathcal{O})$ is a sheaf over $X$, then formation of global
sections, given by $(X, \mathcal{O}) \mapsto \mathcal{O}(X)$, is a covariant left exact functor. Since there
are enough injectives in the category, the derived functors make sense, and the $k$th
derived functor gives what is called the $k$th sheaf cohomology group $H^k(X, \mathcal{O})$
with coefficients in $\mathcal{O}$. This kind of cohomology is easy to use abstractly and hard
to use concretely, but it can be shown to be isomorphic to other more concrete kinds 
of cohomology. In this way the cohomology of sheaves leads to generalizations 




of Euler characteristics and Betti numbers that have significance in number theory 
and geometry. 

In applications, there tends to be a ringed space $(X, \mathcal{R})$ (maybe a scheme)
in the picture, and the sheaves $(X, \mathcal{O})$ often have the property that each stalk
of $\mathcal{O}$ is a module for the corresponding stalk of $\mathcal{R}$. Then the above kind of
theory is applicable for sheaves that are $\mathcal{R}$ modules in this sense, not merely
sheaves of abelian groups. The interested reader can find details in Chapter II of 
Hartshorne’s book. 


BIBLIOGRAPHICAL REMARKS. The topic of schemes assumes knowledge of a 
certain core of algebraic geometry and commutative algebra, and it builds on more 
commutative algebra as it goes along. Some books mentioned in the Selected 
References that include algebraic geometry at the beginning level are those of 
Hartshorne (Chapter I), Harris, Reid, and Shafarevich (Volume 1). All these 
books have many geometric examples; this is particularly so for the book by 
Harris. Some books on commutative algebra are the ones by Atiyah—Macdonald, 
Eisenbud, Matsumura, and Zariski-Samuel. These lists are by no means exhaus- 
tive. There are in fact hundreds of books on the two subjects. To get a list of many 
of the ones in commutative algebra, one can search in the Library of Congress 
catalog at http://catalog.loc.gov, using the call number QA251.3; a few
additional ones are sprinkled in among books with call number QA251. For 
books on algebraic geometry, one can search using the call number QA564. 

The book by Eisenbud—Harris on schemes is an introductory one written 
in a style that makes it comparatively easy for the reader to get an overview 
of the subject. Two older books on schemes are the ones by Macdonald and 
Mumford. Hartshorne’s book introduces schemes in Chapter II, and Volume 2 of
Shafarevich’s books is on that topic. The end of Volume 2 of Shafarevich’s books 
contains a 20-page historical sketch of algebraic geometry, including discussion 
of some of the precursors of the subject of schemes. 


14. Problems 


In all problems, k is understood to be an algebraically closed field. 
1. If $P$ is in $\mathbb{P}^n$, show that the ideal $I(P)$ of members of $k[X_0, \dots, X_n]$ vanishing
at all points $(x_0, \dots, x_n)$ in $k^{n+1} - \{0\}$ with $[x_0, \dots, x_n] = P$ is homogeneous.
2. Let X be a Noetherian topological space. 
(a) Prove that X is compact. 
(b) Prove that every irreducible closed subset of X is connected. 
3. (a) Prove that the image of a quasiprojective variety $V$ under a regular function
$f : V \to \mathbb{A}^1$ is connected.
(b) Prove that if $V$ is a projective variety and $\varphi : V \to \mathbb{A}^n$ is a morphism, then
$\varphi(V)$ is a one-point set.




4. Let $U$ be the quasi-affine variety $U = \mathbb{A}^2 - \{(0, 0)\}$ in $\mathbb{A}^2$. Prove that $\mathcal{O}(U) =
k[X, Y]$.

5. Deduce from the previous problem, Corollary 10.25, and Theorem 10.38 that $U$
is not isomorphic to an affine variety.


6. Prove that a rational map of an irreducible curve into an irreducible curve is 
dominant or is constant. 

7. Let $\varphi : U \to V$ be a dominant morphism between quasiprojective varieties.
Prove that the induced mapping of local rings $\varphi_P : \mathcal{O}_{\varphi(P)}(V) \to \mathcal{O}_P(U)$ given
in Proposition 10.42 is one-one.


8. Let $V$ be the affine variety $V = V(WX - YZ)$ in $\mathbb{A}^4$, let $A(V)$ be the affine
coordinate ring $k[W, X, Y, Z]/(WX - YZ)$, let $\bar{X}$ and $\bar{Y}$ be the images of $X$
and $Y$ in $A(V)$, and let $f = \bar{X}/\bar{Y}$ in the field of fractions of $A(V)$. Prove that
there exist no members $a$ and $b$ of $A(V)$ with $f = a/b$ and $b(w, x, y, z) \ne 0$
whenever $wx = yz$ and one or both of $w$ and $y$ are nonzero.

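9. Let $U$ and $V$ be quasiprojective varieties, and let $\varphi : U \to V$ be a function.
Suppose that $U$ and $V$ are unions of nonempty open subsets $U = \bigcup_{\alpha} U_\alpha$ and
$V = \bigcup_{\alpha} V_\alpha$ such that $\varphi(U_\alpha) \subseteq V_\alpha$ for all $\alpha$. Prove that $\varphi$ is a morphism if
and only if each $\varphi\big|_{U_\alpha} : U_\alpha \to V_\alpha$ is a morphism.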

10. This problem concerns local extensions of regular functions from quasiprojective 
varieties to open sets in the ambient affine or projective space. 

(a) Let $V$ be an affine variety in $\mathbb{A}^n$, let $U$ be a nonempty open subset of $V$, let
$f$ be in $\mathcal{O}(U)$, and let $P$ be a point in $U$. Prove that there exist an open
neighborhood $U_0$ of $P$ in $U$, an open set $\widetilde{U}_0$ in $\mathbb{A}^n$, and a function
$F$ in $\mathcal{O}(\widetilde{U}_0)$ such that $U_0 = V \cap \widetilde{U}_0$ and such that $F$ is an extension of $f\big|_{U_0}$.

(b) Extend the result of (a) to make it valid for any quasiprojective variety $V$ in
$\mathbb{P}^n$.

11. Suppose that $X$ and $Y$ are quasiprojective varieties, that $U$ and $V$ are irreducible
closed subsets of $X$ and $Y$, respectively, and that $\varphi : X \to Y$ is a morphism such
that $\varphi(U) \subseteq V$. Prove that $\varphi : U \to V$ is a morphism.

12. Prove that
(a) the mapping $\varphi : \mathbb{P}^{n-1} \to \mathbb{P}^n$ given by $\varphi([x_0, \dots, x_{n-1}]) = [x_0, \dots, x_{n-1}, 0]$
is an isomorphism of $\mathbb{P}^{n-1}$ onto the projective hyperplane $H_n$ corresponding
to the homogeneous ideal $(X_n)$ of $k[X_0, \dots, X_n]$,

(b) any projective variety $V$ in $\mathbb{P}^n$ that lies in $H_n$ is isomorphic to a projective
variety in $\mathbb{P}^{n-1}$,

(c) any projective variety $V$ in $\mathbb{P}^n$ is isomorphic to a projective variety $V'$ in
some $\mathbb{P}^r$ with $r \le n$ that is not contained in any projective hyperplane defined
by a homogeneous ideal $(X_j)$ of $k[X_0, \dots, X_r]$.


Problems 13-16 relate the classical condition for detecting a singularity in the affine
case to the corresponding condition in the projective case. The key is an identity
traditionally known as Euler’s Theorem that is proved as Problem 3 at the end of
Chapter VIII. In these problems it is assumed that $F_1, \dots, F_r$ are homogeneous
polynomials in $k[X_0, \dots, X_n]$, that $P = [x_0, \dots, x_n]$ is a point in $\mathbb{P}^n$ in their common
locus of zeros, and that $P$ is in the image of $\mathbb{A}^n$ under $\beta_0$, i.e., that $x_0 \ne 0$. Define
$f_1, \dots, f_r$ in $k[X_1, \dots, X_n]$ by $f_i(X_1, \dots, X_n) = F_i(1, X_1, \dots, X_n)$.


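13. Define $J(F)(x_0', \dots, x_n')$ to be the $r$-by-$(n + 1)$ matrix whose $(i, j)$th entry is
$\frac{\partial F_i}{\partial X_j}(x_0', \dots, x_n')$ for $1 \le i \le r$ and $0 \le j \le n$, and define $J(f)(x_1', \dots, x_n')$ to
be the $r$-by-$n$ matrix whose $(i, j)$th entry is $\frac{\partial f_i}{\partial X_j}(x_1', \dots, x_n')$ for $1 \le i \le r$ and
$1 \le j \le n$. Prove that $\operatorname{rank} J(F)(x_0', \dots, x_n') = \operatorname{rank} J(F)(\lambda x_0', \dots, \lambda x_n')$ for
all $\lambda \in k^\times$.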


14. With notation as in Problem 13, prove that the $r$-by-$n$ matrix $J(f)(x_1', \dots, x_n')$
equals the $r$-by-$n$ matrix obtained by deleting the 0th column of the $r$-by-$(n + 1)$
matrix $J(F)(1, x_1', \dots, x_n')$.


15. Using Euler’s Theorem (Problem 3 at the end of Chapter VIII), prove concerning
the point $P$ on the locus of common zeros of $F_1, \dots, F_r$ that the 0th column of
the matrix $J(F)(x_0, \dots, x_n)$ is a linear combination of the other columns of the
matrix.


16. Deduce for the point $P$ on the locus of common zeros of $F_1, \dots, F_r$ that
$\operatorname{rank} J(F)(x_0, x_1, \dots, x_n) = \operatorname{rank} J(f)(x_1/x_0, \dots, x_n/x_0)$.


Problems 17-22 concern products of quasiprojective varieties. The Segre mapping
$\sigma : \mathbb{P}^m \times \mathbb{P}^n \to \mathbb{P}^N$ with $N = mn + m + n$ was defined in Section 8 by
$\sigma([x_0, \dots, x_m], [y_0, \dots, y_n]) = [w_{00}, \dots, w_{mn}]$ with $w_{ij} = x_i y_j$. Let us abbreviate
$[w_{00}, \dots, w_{mn}]$ as $[\{w_{ij}\}]$ and $k[W_{00}, \dots, W_{mn}]$ as $k[\{W_{ij}\}]$.


17. Prove that $\sigma$ is well defined and one-one.


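18. Every member $[\{w_{ij}\}]$ of $\operatorname{image} \sigma$ has $w_{ij} w_{kl} = w_{il} w_{kj}$ for all $i, j, k, l$. Prove
conversely that every member $[\{w_{ij}\}]$ of $\mathbb{P}^N$ with $w_{ij} w_{kl} = w_{il} w_{kj}$ for all $i, j, k, l$
is in $\operatorname{image} \sigma$, and deduce that $\operatorname{image} \sigma = V(\mathfrak{a})$, where $\mathfrak{a}$ is the ideal in $k[\{W_{ij}\}]$
generated by all $W_{ij} W_{kl} - W_{il} W_{kj}$.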
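
The relation stated in the first sentence of Problem 18 can be checked numerically for a sample point. The sketch below is only an illustration and does not address the converse asked for in the problem: it uses integer coordinates in place of elements of $k$, stores the coordinates $w_{ij}$ in a dictionary keyed by $(i, j)$, and uses function names that are chosen here.

```python
from itertools import product


def segre(x, y):
    """Homogeneous coordinates w[(i, j)] = x[i] * y[j] of the Segre image."""
    return {(i, j): xi * yj for i, xi in enumerate(x) for j, yj in enumerate(y)}


def satisfies_segre_relations(w, m, n):
    """Check w_ij * w_kl == w_il * w_kj for all 0 <= i, k <= m and 0 <= j, l <= n."""
    return all(
        w[i, j] * w[k, l] == w[i, l] * w[k, j]
        for i, k in product(range(m + 1), repeat=2)
        for j, l in product(range(n + 1), repeat=2)
    )


w = segre((1, 2, 3), (4, 5))          # a point of P^2 x P^1, so m = 2 and n = 1
in_image = satisfies_segre_relations(w, 2, 1)

perturbed = dict(w)
perturbed[(0, 0)] += 1                # move the point off the Segre image
off_image = satisfies_segre_relations(perturbed, 2, 1)
print(in_image, off_image)
```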


19. This problem will prove that $\mathfrak{a}$ is a prime ideal, and in particular it will follow
that $V(\mathfrak{a})$ is irreducible. Let $\varphi : k[\{W_{ij}\}] \to k[X_0, \dots, X_m, Y_0, \dots, Y_n]$ be the
substitution homomorphism given by setting $W_{ij} = X_i Y_j$. Then $\ker \varphi$ is an ideal
containing $\mathfrak{a}$.

(a) By introducing a suitable monomial ordering in $k[\{W_{ij}\}]$, show that any
monomial in $k[\{W_{ij}\}]$ of total degree $d$ is congruent modulo $\mathfrak{a}$ to a monomial
of total degree $d$ of the form $M = \prod_{i,j} W_{ij}^{a_{ij}}$ having the property that $a_{ij} > 0$
implies that $a_{kl} = 0$ for all $(k, l)$ with $l > j$ and $k > i$. Call a monomial of
this form reduced.


(b) Suppose that $M = \prod_{i,j} W_{ij}^{a_{ij}}$ and $M' = \prod_{i,j} W_{ij}^{b_{ij}}$ are two distinct reduced
monomials. By considering the first $W_{ij}$ for which $a_{ij} \neq b_{ij}$, prove that
$\varphi(M) \neq \varphi(M')$.

(c) Deduce that $\ker\varphi = \mathfrak{a}$, and show why it follows that $\mathfrak{a}$ is prime.

20. Let $\mathfrak{p}$ be a prime ideal in $k[X_0,\dots,X_m]$, and let $R = k[X_0,\dots,X_m]/\mathfrak{p}$ be the
quotient.

(a) Prove that the ideal $\mathfrak{p}\,k[Y_0,\dots,Y_n]$ in $k[X_0,\dots,X_m,Y_0,\dots,Y_n]$ generated
by all products of members of $\mathfrak{p}$ and polynomials in $Y_0,\dots,Y_n$ is prime.

(b) By following the substitution homomorphism

$k[\{W_{ij}\}] \to k[X_0,\dots,X_m,Y_0,\dots,Y_n]$

with a substitution homomorphism $k[X_0,\dots,X_m,Y_0,\dots,Y_n] \to R[Z]$,
prove that whenever $U$ is a projective variety in $\mathbb{P}^m$ and $P$ is a point in $\mathbb{P}^n$,
then $\sigma(U \times \{P\})$ is a projective variety in $\mathbb{P}^N$.

21. Let $U$ and $V$ be projective varieties in $\mathbb{P}^m$ and $\mathbb{P}^n$, respectively. Problem 20
shows that $\sigma(U \times \{v\})$ is a projective variety in $\mathbb{P}^N$ for each $v \in V$. Suppose
that $\sigma(U \times V)$ is a union $E_1 \cup E_2$ of two closed sets in $\mathbb{P}^N$.

(a) For $i$ equal to 1 or 2, define $V_i = \{v \in V \mid \sigma(U \times \{v\}) \nsubseteq E_i\}$. Why is
$V_1 \cap V_2 = \varnothing$?

(b) Prove that $V_1$ and $V_2$ are open by using bihomogeneous polynomials to
exhibit each of $V_1$ and $V_2$ as a neighborhood of each of its points.

(c) Deduce from (b) that $\sigma(U \times V)$ is a projective variety in $\mathbb{P}^N$.

(d) Show how to deduce from (c) that if $U$ and $V$ are quasiprojective varieties
in $\mathbb{P}^m$ and $\mathbb{P}^n$, respectively, then $\sigma(U \times V)$ is a quasiprojective variety in
$\mathbb{P}^N$.

22. (a) Prove that if $U$ and $V$ are quasiprojective varieties, then the projections of
$U \times V$ to $U$ and $V$ are morphisms. Here the projection of $U \times V$ to $U$ is
understood to be the map $\sigma(u,v) \mapsto u$ of $\sigma(U \times V)$ into $U$, and similarly
for the projection to $V$.

(b) If $\varphi : U \to X$ and $\psi : U \to Y$ are morphisms, prove that $(\varphi,\psi) : U \to
X \times Y$ when defined by $(\varphi,\psi)(u) = (\varphi(u),\psi(u))$ is a morphism.

(c) If $\varphi : U \to X$ and $\psi : V \to Y$ are morphisms, prove that $\varphi \times \psi : U \times V \to
X \times Y$ when defined by $(\varphi \times \psi)(u,v) = (\varphi(u),\psi(v))$ is a morphism.

Problems 23-25 make some observations about prime ideals and irreducible 
polynomials. 


23. Let $I = (f_1,\dots,f_r)$ be an ideal in $k[X,Y]$ such that the zero locus $V(I)$ is
irreducible and such that $f_1,\dots,f_r$ are irreducible polynomials.

(a) Prove that $I$ is prime if $\dim V(I) = 1$.

(b) Give an example to show that $I$ need not be prime if $\dim V(I) = 0$.




24. Fix a monomial ordering for $k[X_1,\dots,X_n]$, and let $I$ be a nonzero ideal in
$k[X_1,\dots,X_n]$. Prove that if $I$ is prime, then the members of any minimal
Gröbner basis of $I$ are irreducible polynomials.

25. Suppose that $\operatorname{char}(k) \neq 2$. Within $k[X,Y,Z]$, let $E$ be the homogeneous subspace
$k[X,Y,Z]_2$. The six monomials in $E$ form a $k$ basis of $E$ and may be used
to identify $E$ with $k^6$. Under this identification prove that the subset of reducible
polynomials in $E$, including the 0 polynomial, is an affine hypersurface of $k^6$.


Problems 26-35 concern elliptic curves. An elliptic curve over $k$ is a pair $(E, O)$
consisting of a nonsingular irreducible projective curve $E$ of genus 1 and a distinguished
point $O$. These problems use the Riemann–Roch Theorem and its associated
notation in Chapter IX in order to exhibit a concrete realization of such a curve in
$\mathbb{P}^2$ with $O$ on the line at infinity and with all other points of $E$ in $\mathbb{A}^2$. Such a curve
has a remarkable structure; for further information, including further applications of
the Riemann–Roch Theorem to these curves, see the book by Silverman. Corollary
10.56 identifies the points of $E$ with the discrete valuations of the function field $k(E)$
over $k$. Let $v_O$ be the discrete valuation corresponding to $O$.

26. For $n > 0$, prove that $\ell(nv_O) = n$. Use this result to find members $x$ and $y$ of
$k(E)$ whose divisors satisfy $(x)_\infty = 2v_O$ and $(y)_\infty = 3v_O$.

27. Prove that $[k(E) : k(x)] = 2$ and $[k(E) : k(y)] = 3$.

28. Why does it follow from the previous problem that $k(E) = k(x, y)$?

29. From the fact that $\ell(6v_O) = 6$, deduce a nontrivial linear dependence over $k$
among the members $1, x, y, x^2, xy, y^2, x^3$ of $k(E)$. Show that the coefficients
of $y^2$ and $x^3$ are necessarily nonzero, and then scale $x$ and $y$ appropriately to
show that the image of the function $\varphi : E - \{O\} \to \mathbb{P}^2$ defined by $\varphi(P) =
[x(P), y(P), 1]$ is contained in the projective closure $C$ of the zero locus of the
polynomial $f(X,Y) = (Y^2 + a_1XY + a_3Y) - (X^3 + a_2X^2 + a_4X + a_6)$.

30. Prove that f(X, Y) is irreducible and that C is therefore a projective curve. 

31. Why is $\varphi : E - \{O\} \to C$ a morphism? Why does it follow that $\varphi$ extends to a
morphism $\Phi : E \to C$?

32. Deduce from Problem 28 that $\Phi$ is birational.

33. Show that $C$ is nonsingular at its point at infinity.

34. Show that if $C$ is singular at $(x_0, y_0)$ in $\mathbb{A}^2$, then the member of $k(E)$ given by
$z = (y - y_0)(x - x_0)^{-1}$ has $v_O(z) = -1$ and $v_P(z) \ge 0$ for all $P$ in $E - \{O\}$.

35. Deduce from Problems 33 and 34 that $C$ is nonsingular, and explain why it
follows that $\Phi : E \to C$ is an isomorphism.


HINTS FOR SOLUTIONS OF PROBLEMS 


Chapter I 


1. We are interested in odd $p$'s such that $\left(\frac{m}{p}\right) = +1$. Factor $m$ as $\prod_j p_j^{k_j}$. Then quadratic
reciprocity gives
$$\left(\frac{m}{p}\right) = \prod_j \left(\frac{p_j}{p}\right)^{k_j} = \prod_{k_j\ \mathrm{odd}} \left(\frac{p_j}{p}\right) = \prod_{k_j\ \mathrm{odd}} (-1)^{\frac12(p-1)\frac12(p_j-1)}\left(\frac{p}{p_j}\right).$$
We consider $p \equiv 1 \bmod 4$ and $p \equiv 3 \bmod 4$ separately. For $p \equiv 1 \bmod 4$, the set
in question consists of those $p$'s for which $\left(\frac{p}{p_j}\right)$ is $-1$ for an even number of those
$k_j$'s that are odd. This is the union over all such systems of minus signs of the
intersection over $j$ of the finitely many arithmetic progressions for which the residue
$\left(\frac{p}{p_j}\right)$ equals the $j$th sign. For a single system of minus signs, the result is an arithmetic
progression of the form $k\prod_{k_j\ \mathrm{odd}} p_j + b$ by the Chinese Remainder Theorem. Each
of these contains a nonempty set of primes by Dirichlet's Theorem, and hence $P$ is
nonempty.

For $p \equiv 3 \bmod 4$, if $\prod_{k_j\ \mathrm{odd}} (-1)^{\frac12(p_j-1)}$ is $+1$, then the set in question is of the
same form as above. If $\prod_{k_j\ \mathrm{odd}} (-1)^{\frac12(p_j-1)}$ is $-1$, then the set in question consists of
those $p$'s for which $\left(\frac{p}{p_j}\right)$ is $-1$ for an odd number of those $k_j$'s that are odd, and this
again is the finite union of arithmetic progressions.

2. For (a), the proof of necessity of Theorem 1.6b remains valid when the prime p 
is replaced by the integer m. For (b), the first paragraph of the proof of the sufficiency 
of Theorem 1.6b handles matters if m is odd. 

3. For $D = -56$, $H$ has order 4, but $H'$ has order 3 because $3x^2 \pm 2xy + 5y^2$
are improperly equivalent but not properly equivalent. A 3-element set has no group
structure such that a 4-element group maps homomorphically onto it.


4. For (a), the product of any two integers representable as $ax^2 + bxy + cy^2$ is
representable by the class of the square, which is the class of the inverse because the
class is assumed to have order 3. The class of the inverse is the class of $(a, -b, c)$,
and this represents the same integers as $(a, b, c)$.

For (b), we seek reduced triples. These are $(a, b, c)$ with $|b| \le a \le c$ and with
$b^2 - 4ac = D = -23$, and we know that $3ac \le |D|$ and that $b$ has the same
parity as $D$. Hence $b$ is odd, and the inequalities $3b^2 \le 3a^2 \le 3ac \le 23$ show
that $|b| = 1$. For $|b| = 1$, we have $1 - 4ac = -23$ and $ac = 6$. Since $a \le c$, the
possibilities with $|b| = 1$ are $(1, \pm 1, 6)$ and $(2, \pm 1, 3)$. Since $(1, 1, 6)$ and $(1, -1, 6)$
are properly equivalent by Proposition 1.7, $|b| = 1$ leads to just the three possibilities
$(1, 1, 6)$, $(2, 1, 3)$, and $(2, -1, 3)$. Proposition 1.7 shows that these lie in distinct
proper equivalence classes, and thus $h(-23) = 3$.
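
The search for reduced triples can be mechanized. The sketch below is an illustration rather than the book's procedure: it assumes the common normalization $-a < b \le a \le c$, with $b \ge 0$ when $a = c$, which selects one representative from each pair that Proposition 1.7 identifies. The function name `reduced_forms` is hypothetical.

```python
from math import gcd


def reduced_forms(D):
    """List primitive reduced forms (a, b, c) of negative discriminant D.

    Assumed normalization: -a < b <= a <= c, and b >= 0 when a == c,
    so that each proper equivalence class appears exactly once.
    """
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    while 3 * a * a <= -D:            # from 3a^2 <= 3ac <= |D|
        for b in range(-a + 1, a + 1):
            if (b - D) % 2:           # b must have the same parity as D
                continue
            num = b * b - D           # equals 4ac
            if num % (4 * a):
                continue
            c = num // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(gcd(a, abs(b)), c) != 1:
                continue              # keep primitive forms only
            forms.append((a, b, c))
        a += 1
    return forms


print(reduced_forms(-23))
print(len(reduced_forms(-67)))
```

The same function applied to $D = -67$ and $D = -56$ gives a quick cross-check of the counts quoted in Problems 6 and 3.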




For (c), the general theory shows that (1, 1, 6) corresponds to the identity class, 
and therefore the other two reduced forms are in classes of order 3. 

For (d), we first track down what happens to the forms. If we write $\sim$ for proper
equivalence, then we have

$$(2,1,3)(2,1,3) \sim (2,1,3)(3,-1,2) \sim (2,5,6)(3,5,4) = (6,5,2) \sim (2,-5,6) \sim (2,-1,3),$$

and the last form is improperly equivalent to $(2, 1, 3)$. The next step is to interpret this
chain with actual variables. If the initial variables are $x_1, y_1, x_2, y_2$, then the change
at the first step from $(2, 1, 3)$ to $(3, -1, 2)$ comes from $x_2 = y_2'$, $y_2 = -x_2'$ while
leaving $x_1$ and $y_1$ unchanged as $x_1 = x_1'$, $y_1 = y_1'$. The change at the second step
from $(2, 1, 3)$ to $(2, 5, 6)$ and from $(3, -1, 2)$ to $(3, 5, 4)$ comes from the translations
$x_1' = x_1'' + y_1''$, $y_1' = y_1''$, $x_2' = x_2'' + y_2''$, $y_2' = y_2''$. The multiplication step comes from
Proposition 1.9 and is given by $x_3 = x_1''x_2'' - 2y_1''y_2''$ and $y_3 = 2x_1''y_2'' + 3x_2''y_1'' + 5y_1''y_2''$.
And so on. The final result is that

$$(2x_1^2 + x_1y_1 + 3y_1^2)(2x_2^2 + x_2y_2 + 3y_2^2) = 2X^2 + XY + 3Y^2,$$

where $X = x_1(-x_2 + y_2) + y_1(x_2 + 2y_2)$ and $Y = y_1(x_2 - y_2) + x_1(x_2 + y_2)$.
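
The final identity lends itself to a direct numerical spot check. The following sketch, with the hypothetical helper names `form` and `compose`, evaluates both sides at random integer points; agreement at many points is evidence for the identity, not a proof of it.

```python
import random


def form(x, y):
    """The binary quadratic form 2x^2 + xy + 3y^2 of discriminant -23."""
    return 2 * x * x + x * y + 3 * y * y


def compose(x1, y1, x2, y2):
    """The bilinear substitution (X, Y) stated at the end of the hint."""
    X = x1 * (-x2 + y2) + y1 * (x2 + 2 * y2)
    Y = y1 * (x2 - y2) + x1 * (x2 + y2)
    return X, Y


random.seed(0)
ok = True
for _ in range(1000):
    x1, y1, x2, y2 = (random.randint(-50, 50) for _ in range(4))
    X, Y = compose(x1, y1, x2, y2)
    ok = ok and form(x1, y1) * form(x2, y2) == form(X, Y)
print(ok)
```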


5. The equality ‘es _) is sy C ce) = (Ss ) shows this. 


6. For reduced forms we seek $(a, b, c)$ with $a > 0$, $c > 0$, $|b| \le a \le c$. We know
that $3ac \le |D| = 67$, and $D$ odd implies $b$ odd. From $3b^2 \le 3a^2 \le 3ac \le 67$, we
obtain $3b^2 \le 67$ and $|b| \le 4$. So $|b|$ is 1 or 3. For $|b| = 1$, $\frac14(b^2 - D) = \frac14(b^2 + 67) =
17$; then $17 = ac$, and $a = 1$ and $c = 17$. Since $(1, 1, 17)$ is properly equivalent
to $(1, -1, 17)$ by Proposition 1.7, we obtain only one proper equivalence class from
this pair. For $|b| = 3$, $\frac14(b^2 - D) = \frac14(9 + 67) = 19$ forces $ac = 19$ and then $a = 1$
and $c = 19$. Then $|b| \le a$ is not satisfied. So $|b| = 3$ gives no proper equivalence
classes, and $h(-67) = 1$.


7. The 6 cycles are 


(1,8, —15), (—15, 7,2), (2,7, —15), (—15, 8, 1); 
(—1, 8, 15), (15,7, —2), (—2,7, 15), (15, 8, —1); 
(3, 8, —5), (—5, 7, 6), (6,5, —9), (—9,4, 7), (7,3, —10), (—10, 7, 3); 
(—3, 8,5), (5,7, —6), (—6,5,9), (9,4, —7), (—7,3, 10), (10, 7, —3); 
(5,8, —3), (—3, 7, 10), (10,3, —7), (—7, 4,9), (9,5, —6), (—6, 7, 5); 
(—5, 8,3), (3,7, -10), (—10,3, 7), (7,4, —9), (—9, 5,6), (6,7, —5). 


8. The form $(1, 1, 12)$ corresponds to the identity class, the classes of $(2, \pm 1, 6)$ are
inverses of one another, and the classes of $(3, \pm 1, 4)$ are inverses of one another. The
group structure has to be cyclic, and any element other than the identity can be taken as
a generator. Let us take $a$ to be the class of $(2, 1, 6)$. We are to identify $a^2$. The form
$(2, 1, 6)$ is aligned with itself (having the same $b$ component), it has $j = 6/2 = 3$,
and the composition formula of Proposition 1.9 leads to $(2 \cdot 2, 1, j) = (4, 1, 3)$.
This is properly equivalent to $(3, -1, 4)$, and we do not have to follow through the
algorithm of Theorem 1.6a to identify the product in our list. The result is that
$a \leftrightarrow (2, 1, 6)$, $a^2 \leftrightarrow (3, -1, 4)$, $a^3 = (a^2)^{-1} \leftrightarrow (3, 1, 4)$, $a^4 = a^{-1} \leftrightarrow (2, -1, 6)$,
and $a^5 = 1 \leftrightarrow (1, 1, 12)$.

10. For (a), the result is known for $n$ prime by Theorem 1.2. By induction and
the definition of the Jacobi symbol, it is enough to handle $n = ab$ when $a$ and
$b$ can be handled. We have $\frac12(n - 1) = \frac12(ab - 1) = \frac12 b(a - 1) + \frac12(b - 1)
\equiv \frac12(a - 1) + \frac12(b - 1) \bmod 2$, the last step following because $b$ is odd. Therefore
$(-1)^{\frac12(n-1)} = (-1)^{\frac12(a-1) + \frac12(b-1)} = \left(\frac{-1}{a}\right)\left(\frac{-1}{b}\right) = \left(\frac{-1}{n}\right)$, the last step following by
Problem 9a.

For (b), we argue similarly, and the key computation is $\frac18(n^2 - 1) = \frac18(a^2b^2 - 1) =
\frac18 b^2(a^2 - 1) + \frac18(b^2 - 1) \equiv \frac18(a^2 - 1) + \frac18(b^2 - 1) \bmod 2$, the last step following
because $b^2$ is odd.


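11. Allowing primes to appear more than once, write factorizations of $m$ and $n$ as
$m = \prod_i p_i$ and $n = \prod_j q_j$. Then Theorem 1.2 gives
$$\left(\frac{m}{n}\right) = \prod_{i,j}\left(\frac{p_i}{q_j}\right) = \prod_{i,j}\left(\frac{q_j}{p_i}\right)(-1)^{\frac12(p_i-1)\frac12(q_j-1)} = \left(\frac{n}{m}\right)(-1)^{\sum_i\sum_j \frac12(p_i-1)\frac12(q_j-1)}.$$
Since

$$\sum_i\sum_j \tfrac12(p_i - 1)\tfrac12(q_j - 1) = \Big[\sum_i \tfrac12(p_i - 1)\Big]\Big[\sum_j \tfrac12(q_j - 1)\Big]$$

and since $\sum_j \frac12(q_j - 1) \equiv \frac12(n - 1) \bmod 2$ and $\sum_i \frac12(p_i - 1) \equiv \frac12(m - 1) \bmod 2$
by the same argument as in Problem 10a, the required formula follows.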
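
The formulas of Problems 10 and 11 can be tested on small odd integers. The sketch below is illustrative only: it computes the Jacobi symbol straight from its definition as a product of Legendre symbols (via Euler's criterion and trial-division factoring), so that reciprocity is being checked rather than assumed. The function names are hypothetical.

```python
from math import gcd


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)   # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r


def prime_factors(n):
    """Prime factors of n >= 1 with multiplicity, by trial division."""
    out, d = [], 2
    while d * d <= n:
        while n % d == 0:
            out.append(d)
            n //= d
        d += 1
    if n > 1:
        out.append(n)
    return out


def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n, as a product of Legendre symbols."""
    result = 1
    for p in prime_factors(n):
        result *= legendre(a, p)
    return result


odd = range(1, 60, 2)
reciprocity_ok = all(
    jacobi(m, n) * jacobi(n, m) == (-1) ** (((m - 1) // 2) * ((n - 1) // 2))
    for m in odd for n in odd if gcd(m, n) == 1
)
supplements_ok = all(
    jacobi(-1, n) == (-1) ** ((n - 1) // 2) and jacobi(2, n) == (-1) ** ((n * n - 1) // 8)
    for n in odd
)
print(reciprocity_ok, supplements_ok)
```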


12. For (a), choose by Dirichlet's Theorem a sufficiently large prime $p$ that is
$\equiv 3 \bmod 8$ and is in particular $\equiv 3 \bmod 4$. If 8 divides $|G|$, then the fact that $|G|$
divides $p + 1$ implies that 8 divides $p + 1$. So $p \equiv -1 \bmod 8$. Since $p$ was chosen
with $p \equiv 3 \bmod 8$, this is a contradiction. So 8 cannot divide $|G|$.

For (b), choose by Dirichlet's Theorem a sufficiently large prime $p$ that is $\equiv
7 \bmod 12$ and is in particular $\equiv 3 \bmod 4$. If 3 divides $|G|$, then 3 divides $p + 1$. Thus
$p \equiv -1 \bmod 3$. Since also $p \equiv 3 \bmod 4$, $p \equiv 11 \bmod 12$. But $p$ was chosen with
$p \equiv 7 \bmod 12$. This is a contradiction, and 3 cannot divide $|G|$.

For (c) with an odd prime $q > 3$ given, choose by Dirichlet's Theorem a sufficiently
large prime $p$ that is $\equiv 3 \bmod 4q$ and is in particular $\equiv 3 \bmod 4$. If $q$ divides $|G|$,
then $q$ divides $p + 1$, and $p + 1 \equiv 0 \bmod q$. Meanwhile, $p \equiv 3 \bmod 4q$ implies that
$p + 1 \equiv 4 \bmod 4q$ and $p + 1 \equiv 4 \bmod q$, contradiction. So $q$ cannot divide $|G|$.

13. For (a), choose by Dirichlet's Theorem a sufficiently large prime $p$ that is
$\equiv 5 \bmod 12$ and is in particular $\equiv 2 \bmod 3$ and $\equiv 1 \bmod 4$. If 4 divides $|G|$, then 4
divides $p + 1$, which is $\equiv 2 \bmod 4$. So 4 cannot divide $|G|$.

For (b), choose by Dirichlet's Theorem a sufficiently large prime $p$ that is $\equiv
2 \bmod 9$ and is in particular $\equiv 2 \bmod 3$. If 9 divides $|G|$, then 9 divides $p + 1$, which
is $\equiv 3 \bmod 9$. So 9 cannot divide $|G|$.

For (c) with an odd prime $q > 3$ given, choose by Dirichlet's Theorem a sufficiently
large prime $p$ that is $\equiv 2 \bmod 3q$ and is in particular $\equiv 2 \bmod 3$. If $q$ divides $|G|$,
then $q$ divides $p + 1$, which is $\equiv 3 \bmod 3q$ and hence is $\equiv 3 \bmod q$. So $q$ cannot
divide $|G|$.

14. The integers in $(a, r)$ are exactly the multiples of $a$, since such an integer $n$ has
to be of the form $n = ca + dr$ for integers $c$ and $d$. This equation says that $n = ca$
and $0 = dr$, since 1 and $r$ are linearly independent over $\mathbb{Q}$. The integer $N(s) = s\sigma(s)$
is in $I$ because $s$ is in $I$ and $\sigma(s)$ is in $R$, and thus $N(s)$ has to be a multiple of $a$.

15. Write $I = (a, r)$ with $a > 0$ an integer and $r$ in $I$ by Lemma 1.19b. As in the
previous problem, the integer $a$ is characterized uniquely in terms of $I$ as the least
positive integer in $I$. Put $r = b + g\delta$ for suitable integers $b$ and $g$. Without loss of
generality, we may assume that $g > 0$. Using the division algorithm and possibly
replacing $b$ by $b - na$ for some integer $n$, we may assume that $0 \le b < a$.

With these conventions in place, let us see that $g$ necessarily divides $a$. The fact
that $a\delta$ has to be in $I$ means that $a\delta$ has an expansion $a\delta = c_1a + c_2(b + g\delta)$ with
integer coefficients. Then $a\delta = c_2g\delta$, and $g$ must divide $a$.

In particular, $0 < g \le a$ is forced. To see that $b$ and $g$ are uniquely determined,
let $\{a, b' + g'\delta\}$ be another such $\mathbb{Z}$ basis. Since $b' + g'\delta = c_1a + c_2(b + g\delta)$ and since
symmetrically we have $b + g\delta = c_1'a + c_2'(b' + g'\delta)$, we obtain $g' = c_2g = c_2c_2'g'$.
Therefore $|c_2| = 1$. Meanwhile, we must have

$$c_1a + c_2b = b' \quad\text{and}\quad c_2g = g'.$$

The second of these equations shows that $c_2 > 0$. Thus $c_2 = 1$. Finally $c_1a = b' - b$
with $0 \le b < a$ and $0 \le b' < a$ forces $b' - b = 0$. Therefore $a$, $b$, and $g$ are uniquely
determined.

To complete the proof, we need to see that $g$ divides $b$ and that $ag$ divides $N(b + g\delta)$.
Since $a\delta$ is in $I$, $a\delta = c_1'a + c_2'(b + g\delta)$. Hence $c_2'g = a$ and $c_1'a + c_2'b = 0$.
Substituting the first of these equations into the second gives $c_1'c_2'g + c_2'b = 0$. Since
$c_2' \neq 0$ from the equality $c_2'g = a$, $c_1'g + b = 0$. Thus $g$ divides $b$.

To see that $ag$ divides $N(b + g\delta)$, we use the fact that $g\sigma(\delta)(b + g\delta)$ is in $I$ to
write $bg\sigma(\delta) + \delta\sigma(\delta)g^2 = d_1ag + d_2g(b + g\delta)$ for some integers $d_1$ and $d_2$. Then
$N(b + g\delta) = b^2 + bg(\delta + \sigma(\delta)) + \delta\sigma(\delta)g^2 = b^2 + bg\delta + d_1ag + d_2g(b + g\delta)$.
Equating coefficients of $\delta$ and 1 gives

$$0 = bg + d_2g^2 \quad\text{and}\quad N(b + g\delta) = b^2 + d_1ag + d_2bg.$$

Since $g > 0$, the first of these equations gives $d_2 = -bg^{-1}$. Substituting into the
second equation gives
$$N(b + g\delta) = b^2 + d_1ag - (bg^{-1})bg = d_1ag,$$
and we see that $ag$ divides $N(b + g\delta)$.




16. We are to show that $\mathbb{Z}a + \mathbb{Z}(b + g\delta)$ is closed under multiplication by arbitrary
members of $R$. It is enough to treat multiplication by 1 and by $\delta$. There is no problem
for 1. Since $\delta + \sigma(\delta)$ is in $\mathbb{Z}$, it is enough to show that there exist integers $c_1, c_2, d_1, d_2$
with

$$\delta a = c_1a + c_2(b + g\delta) \quad\text{and}\quad \sigma(\delta)(b + g\delta) = d_1a + d_2(b + g\delta).$$

In view of the assumed divisibility, we can put $c_2 = ag^{-1}$, $c_1 = -bg^{-1}$, $d_2 = -bg^{-1}$,
and $d_1 = N(b + g\delta)(ag)^{-1}$. Then the first equation is certainly satisfied, and the
question concerning the second equation, once we have multiplied it by $g$, is whether
we have an equality

$$g\sigma(\delta)(b + g\delta) = N(b + g\delta) - b^2 - bg\delta.$$

The left side is $N(b + g\delta) - b(b + g\delta)$, and thus equality indeed holds.


17. From Section 7 the relevant formula is $N(I) = |\sqrt{D}|^{-1}\,|r_1\sigma(r_2) - \sigma(r_1)r_2|$.
Here we can take $r_1 = a$ and $r_2 = c + d\delta$. Substitution gives

$$N(I) = |\sqrt{D}|^{-1}|a|\,|\sigma(c + d\delta) - (c + d\delta)|
= |\sqrt{D}|^{-1}|a|\,|c + d\sigma(\delta) - c - d\delta| = |\sqrt{D}|^{-1}|ad|\,|\sigma(\delta) - \delta|.$$

The expression $|\sqrt{D}|^{-1}|\sigma(\delta) - \delta|$ arose in Section 7 in the computation of $N(R)$
and was shown to be 1. Thus $N(I) = |ad|$.


18. For (a), the algorithm of Section IV.9 of Basic Algebra shows how to align
matters so as to compute the quotient of a free abelian group by a subgroup when
the subgroup is given by generators. The given relationship between the generators
$a$ and $b + g\delta$ of Problem 15 with the $\mathbb{Z}$ basis of $R$ is

$$\begin{pmatrix} a \\ b + g\delta \end{pmatrix} = \begin{pmatrix} a & 0 \\ b & g \end{pmatrix}\begin{pmatrix} 1 \\ \delta \end{pmatrix}.$$

The procedure is to do row and column operations on the coefficient matrix to bring
it into diagonal form. Since $g$ divides $b$, a column operation replaces the $b$ by 0.
We obtain a diagonal matrix with diagonal entries $a$ and $g$, and the quotient group
is identified as $(\mathbb{Z}/a\mathbb{Z}) \oplus (\mathbb{Z}/g\mathbb{Z})$. Thus $ag$ is identified as the number of elements
in the quotient group $R/I$. Problem 17 identified $ag$ as $N(I)$, and thus $N(I)$ is the
number of elements in $R/I$.

For (b), the inclusion $I \subseteq J$ induces a quotient mapping of the finite group $R/I$
onto $R/J$. As a homomorphic image of $R/I$, $R/J$ must have an order that divides
the order of $R/I$. In view of (a), $N(J)$ divides $N(I)$. The equality $I = J$ holds if
and only if the quotient mapping is one-one, and this happens, because of the finite
cardinalities, if and only if $N(I) = N(J)$.




19. The relevant arguments for the first three parts of this problem already appear
in Chapters VIII and IX of Basic Algebra, and thus we can be brief. For (a), the
Chinese Remainder Theorem (Theorem 8.27 of Basic Algebra) shows that $R/IJ \cong
R/I \times R/J$, and then $N(IJ) = N(I)N(J)$ by Problem 18a. For (b), the inductive
argument for $(**)$ in the proof of Theorem 9.60 of Basic Algebra shows
that $\dim_{\mathbb{Z}/p\mathbb{Z}} R/P^e = ef$, and thus $|R/P^e| = p^{ef}$. For (c), Corollary 8.63 of
Basic Algebra and Problem 18a above together show that $N(I) = \prod_j N(P_j^{e_j})$ if
$I = \prod_j P_j^{e_j}$ is the unique factorization of the ideal $I$. Since $N(P_j^{e_j}) = N(P_j)^{e_j}$
by (b), $N(I) = \prod_j N(P_j)^{e_j}$, and (c) follows immediately.

For (d), we use Problem 15 to write $I = (a, b + g\delta)$; then

$$I\sigma(I) = \big(a^2,\ a(b + g\delta),\ a(b + g\sigma(\delta)),\ N(b + g\delta)\big).$$

Each of the generators on the right side lies in the principal ideal $(ag)$. In fact, $a^2$ is in
$(ag)$ because $g$ divides $a$, $a(b + g\delta)$ and $a(b + g\sigma(\delta))$ are in $(ag)$ because $g$ divides
$b$, and $N(b + g\delta)$ is in $(ag)$ because $ag$ divides $N(b + g\delta)$. Therefore $I\sigma(I) \subseteq (ag)$.
Since $N(I) = ag$ by Problem 17, Problem 19c shows that $N(I\sigma(I)) = N((ag))$.
Then $I\sigma(I) = (ag) = (N(I))$ by Problem 18b.

20. The only ideal $I$ with $N(I) = 1$ is $I = R$. Problem 19c therefore shows that a
nontrivial factorization of $(p)R$ leads to a nontrivial factorization of its norm, which
is $p^2$. This factorization must be $p^2 = p \cdot p$, and thus $(p)R$ factors nontrivially at most
into two factors, each with norm $p$.


21. For (a), we use Problem 15 to write a nontrivial factor $I$ of $(2)R$ as $I =
(a, b + g\delta)$. Problem 17 shows that $2 = N(I) = ag$ with $g$ dividing $a$. Therefore
$a = 2$ and $g = 1$. So the only possible factors are of the form $I = (2, b + \delta)$ with
$0 \le b < a = 2$. Thus $b = 0$ or $b = 1$. When $D$ is odd, we have $\operatorname{Tr}(\delta) = 1$ and
$N(\delta) = \frac14(1 - m)$. Then $N(b + \delta) = b^2 + b\operatorname{Tr}(\delta) + N(\delta) = b^2 + b + \frac14(1 - m) \equiv
\frac14(1 - m) \bmod 2$. If $m \equiv 5 \bmod 8$, then we see that 2 does not divide $N(b + \delta)$, and
thus $(2)R$ cannot have a nontrivial factor.

For (b), we again have $N(b + \delta) = b^2 + b\operatorname{Tr}(\delta) + N(\delta) = b^2 + b + \frac14(1 - m) \equiv
\frac14(1 - m) \bmod 2$, and the condition $m \equiv 1 \bmod 8$ makes the right side 0. Thus 2
divides $N(b + \delta)$, and $(2, \delta)$ and $(2, 1 + \delta)$ are both ideals by Problem 16. The product
of these ideals is $(2, \delta)(2, 1 + \delta) = (4, 2\delta, 2(1 + \delta), \delta(1 + \delta))$ and contains $(2)R$ because
$2 = 2(1 + \delta) - 2\delta$. Moreover, the product has norm 4 by Problems 17 and 19c, and
this matches the norm of $(2)R$. Thus Problem 18b shows that $(2, \delta)(2, 1 + \delta) = (2)R$.

For (c) and (d), $\delta = -\sqrt{m}$. Thus $N(b + \delta) = b^2 + b\operatorname{Tr}(\delta) + N(\delta) = b^2 - m =
b^2 - \frac14 D$. If $D/4 \equiv 3 \bmod 4$, then $b^2 - \frac14 D$ is divisible by 2 for $b = 1$. If $D/4 \equiv
2 \bmod 4$, then $b^2 - \frac14 D$ is divisible by 2 for $b = 0$. With $b$ taking on the appropriate
value in the two cases, $(2, b + \delta)$ is an ideal by Problem 16. The square of this ideal
is $(4, 2(b + \delta), (b - \sqrt{m})^2) = (4, 2(b + \delta), b^2 + m - 2b\sqrt{m})$. The definition of $b$
makes $b^2 + m$ even in every case, and hence $(2, b + \delta)^2 \supseteq (2)R$. Since the norms of
the ideals on the two sides are both 4, the two ideals must be equal.




22. Arguing as in the previous problem, we see that any nontrivial factor of $(p)R$
must have norm $p$ and therefore must be given by $(p, x + \delta)$ for some $x$ such that $p$
divides $N(x + \delta) = x^2 + x\operatorname{Tr}(\delta) + N(\delta)$.

For (a), $\operatorname{Tr}(\delta) = 1$ and $N(\delta) = \frac14(1 - m) = \frac14(1 - D)$, and the condition is that
$p$ divide $x^2 + x + \frac14(1 - D)$. This means that $x^2 + x + \frac14(1 - D) \equiv 0 \bmod p$
is to have a solution. When this happens, Problem 16 ensures that $(p, x + \delta)$ is
an ideal. Then $(p, x + \sigma(\delta))$ is an ideal as well, and the product of the two is
$(p^2, p(x + \delta), p(x + \sigma(\delta)), N(x + \delta))$. Since $p$ divides $N(x + \delta)$, this product ideal
is contained in $(p)R$. The product ideal and $(p)R$ both have norm $p^2$, and therefore
they are equal.

For (b), $\operatorname{Tr}(\delta) = 0$ and $N(\delta) = -m = -D/4$, and the condition is that $p$ divide
$x^2 - D/4$. This means that $x^2 - D/4 \equiv 0 \bmod p$ is to have a solution. When this
happens, Problem 16 ensures that $(p, x + \delta)$ is an ideal. Then $(p, x + \sigma(\delta))$ is an
ideal as well, and the product of the two is $(p^2, p(x + \delta), p(x + \sigma(\delta)), N(x + \delta))$.
Since $p$ divides $N(x + \delta)$, this product ideal is contained in $(p)R$. The product ideal
and $(p)R$ both have norm $p^2$, and therefore they are equal.

For (c), the respective conditions for factorization in (a) and (b) are that
$x^2 + x + \frac14(1 - D) \equiv 0 \bmod p$ and $x^2 - D/4 \equiv 0 \bmod p$ be solvable. In both cases
the quadratic expression on the left side has discriminant $D$. Hence factorization
occurs if and only if $D$ is a square modulo $p$.
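
The equivalence in (c) can be confirmed by brute force for small odd primes. The sketch below is illustrative only; the function names are hypothetical, the sample discriminants are arbitrary choices, and "square modulo $p$" is taken to include the residue 0.

```python
def has_root_mod_p(D, p):
    """Solvability mod p of the quadratic congruence from parts (a) and (b)."""
    if D % 4 == 1:
        c = (1 - D) // 4              # x^2 + x + (1 - D)/4
        return any((x * x + x + c) % p == 0 for x in range(p))
    if D % 4 == 0:
        q = D // 4                    # x^2 - D/4
        return any((x * x - q) % p == 0 for x in range(p))
    raise ValueError("D must be congruent to 0 or 1 mod 4")


def is_square_mod_p(D, p):
    """True if D is congruent to a square modulo p (0 counts as a square)."""
    return any((y * y - D) % p == 0 for y in range(p))


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
sample_discriminants = [-67, -56, -23, -4, -3, 5, 8, 12, 21]
agree = all(
    has_root_mod_p(D, p) == is_square_mod_p(D, p)
    for D in sample_discriminants for p in odd_primes
)
print(agree)
```

The agreement reflects the identities $4(x^2 + x + \frac14(1 - D)) = (2x + 1)^2 - D$ and $4(x^2 - D/4) = (2x)^2 - D$, together with the invertibility of 2 modulo an odd prime.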


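23. In both cases we are assuming that $(p)R$ has a factor $I = (p, x + \delta)$ with
$0 \le x < p$. Using Problem 15, let us write $\sigma(I) = (p, x + \sigma(\delta)) = (p, y + \delta)$
with $0 \le y < p$. Choose integers $c$ and $d$ with $x + \sigma(\delta) = cp + d(y + \delta)$. Since
$\sigma(\delta) = \operatorname{Tr}(\delta) - \delta$, the equation is $x + \operatorname{Tr}(\delta) - \delta = cp + dy + d\delta$, and we obtain
$x + \operatorname{Tr}(\delta) = cp + dy$ and $-\delta = d\delta$. Thus $d = -1$, $x + \operatorname{Tr}(\delta) = cp - y$, and
$cp = x + y + \operatorname{Tr}(\delta)$. From $0 \le x < p$ and $0 \le y < p$, we have $0 \le x + y + \operatorname{Tr}(\delta) \le
2(p - 1) + \operatorname{Tr}(\delta) \le 2p - 1$. So $c$ in the equation $cp = x + y + \operatorname{Tr}(\delta)$ has to be 1
or 0, and the equation is $x + y = p - \operatorname{Tr}(\delta)$ or $x + y = -\operatorname{Tr}(\delta)$. The condition that
$\sigma(I) = I$ is the condition that $x = y$, hence that $2x = p - \operatorname{Tr}(\delta)$ or $2x = -\operatorname{Tr}(\delta)$.
When $D$ is odd, this says that $x = \frac12(p - 1)$; when $D$ is even, it says that $x = 0$.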

24. Since $\sigma((p, x + \delta)) = (p, x + \sigma(\delta))$, the two factors are the same if and only
if $\sigma(I) = I$. Problem 23 says that the latter equality holds for $D$ odd if and only if
$x = \frac12(p - 1)$ and that it holds for $D$ even if and only if $x = 0$. In the two cases we
know from Problem 14 that $p$ divides $N(x + \delta) = x^2 + x\operatorname{Tr}(\delta) + N(\delta)$.

When $D$ is odd, this result says that $p$ divides $x^2 + x + \frac14(1 - D)$, hence that it
divides $4x^2 + 4x + (1 - D) = (2x + 1)^2 - D$. Then $p$ divides $D$ if and only if $p$
divides $2x + 1$, if and only if $x = \frac12(p - 1)$.

When $D$ is even, we know from Problem 14 that $p$ divides $x^2 - m$. Hence $p$
divides $4(x^2 - m) = 4x^2 - D = (2x)^2 - D$. Then $p$ divides $D$ if and only if $p$
divides $2x$, if and only if $x = 0$.

25. Theorem 1.14 shows that the genus group $G$ is the quotient of the abelian group
$H$ modulo its subgroup of squares. The subgroup of squares consists of the elements
in the product of the cyclic subgroups of orders $2^{k_1 - 1}, \dots, 2^{k_r - 1}, q_1^{l_1}, \dots, q_s^{l_s}$, and the
quotient is the product of $r$ copies of a cyclic group of order 2. Thus $G$ has order $2^r$.
The subgroup of elements of $H$ whose order divides 2 is the product of the 2-element
subgroups of the cyclic groups of orders $2^{k_1}, \dots, 2^{k_r}$. It is a product of $r$ copies of a
cyclic group of order 2 and hence is abstractly isomorphic to $G$.


26. If $P$ is a nonzero prime ideal, then so is $\sigma(P)$. Since $\sigma^2 = 1$, the mapping
$P \mapsto \sigma(P)$ is a permutation of order 2 on the nonzero prime ideals. Evidently the
prime ideals of type (i) above are permuted in 2-cycles, and the prime ideals of types
(ii) and (iii) are left fixed.

If a nonzero ideal $I$ has prime factorization $I = \prod_j P_j^{k_j}$, then $\sigma(I) = \prod_j \sigma(P_j)^{k_j}$.
When $\sigma(I) = I$, we can match the factors and their exponents. We conclude that the
factorization of $I$ is as

$$I = \Big(\prod_{\substack{\text{pairs } (P_j,\sigma(P_j))\\ \text{of type (i)}}} (P_j\sigma(P_j))^{k_j}\Big)\Big(\prod_{\substack{\text{ideals } P_j\\ \text{of type (ii)}}} P_j^{k_j}\Big)\Big(\prod_{\substack{\text{ideals } P_j\\ \text{of type (iii)}}} P_j^{k_j}\Big).$$

Each factor in the first product is of the form $(N(P_j))^{k_j}$ by Problem 19d, each factor
in the second product is of the form $(p)^{k_j}$ for some prime $p$ not dividing $D$, and each
$P_j^2$ contributing to the third factor is of the form $(p)$ for some prime $p$ dividing $D$.
The result follows.


27. For (a), the only nontrivial step in the displayed formula is the third equality,
which follows because $x\sigma(x) = N(x) = 1$ by hypothesis. If we take $y = (1 + x)^{-1}$,
then the displayed formula gives $x = (1 + x)(1 + \sigma(x))^{-1} = y^{-1}\sigma(y)$ as required.

For (b), the equality $\sigma(y)y^{-1} = x$ remains valid when $y$ is replaced by $ny$ with
$n \in \mathbb{Z}$, and thus we may take $y$ to be in $R$. Now let $y$ and $z$ be in $R$ with $\sigma(z)z^{-1} =
x = \sigma(y)y^{-1}$. Then $\sigma(zy^{-1}) = zy^{-1}$, and $zy^{-1}$ is in $\mathbb{Q}$. Among all $y \in R$ with
$\sigma(y)y^{-1} = x$, let $y_0$ be one with $|N(y)|$ as small as possible; $y_0$ exists because $|N(y)|$
is an integer in each case. If $\sigma(z)z^{-1} = x$, write $z = u + v\delta$, $y_0 = a + b\delta$, and $zy_0^{-1} =
p/q$ with $\operatorname{GCD}(p, q) = 1$. Then $qu + qv\delta = qz = py_0 = pa + pb\delta$, and we obtain
$qu = pa$ and $qv = pb$. Therefore $q$ divides $a$ and $b$, and $q^{-1}y_0 = q^{-1}a + q^{-1}b\delta$ is
in $R$. Then $y = q^{-1}y_0$ is another element in $R$ with $\sigma(y)y^{-1} = x$, and it contradicts
the minimal choice of $|N(y_0)|$ unless $|q| = 1$. We conclude that $z = py_0$.

28. In (a), $N(I^2) = N((x))$ says that $N(I)^2 = |N(x)|N(R) = |N(x)|$. Therefore
$N(x^{-1}N(I)) = |N(x)|^{-1}N(N(I)) = |N(x)|^{-1}N(I)^2 = 1$, and $xN(I)^{-1}$ has
norm 1.

In (b), Problem 27b gives us $y_0 \in R$ with $\sigma(y_0)y_0^{-1} = xN(I)^{-1}$. Then we
compute that $\sigma((y_0)I) = \sigma(y_0)\sigma(I) = y_0xN(I)^{-1}\sigma(I) = y_0N(I)^{-1}(x)\sigma(I) =
y_0N(I)^{-1}I^2\sigma(I) = y_0N(I)^{-1}(N(I))I = y_0I$.

For (c), suppose $N(y_0) > 0$. Then Problem 26a shows that $(y_0)I = (a)J_S$ for
some $a \in \mathbb{Z}$, and this gives the required strict equivalence. If $N(y_0) < 0$, then
$N(y_0\sqrt{m}) > 0$, and $\sigma((y_0\sqrt{m})I) = (y_0\sqrt{m})I$; Problem 26a shows that $(y_0\sqrt{m})I
= (a)J_S$ for some $a \in \mathbb{Z}$, and this gives the required strict equivalence.




29. For (a), since $m < 0$ and $m$ is neither $-1$ nor $-3$, the possible units are
$\varepsilon = \pm 1$. The equality $\sigma(x) = \varepsilon x$ says that $x$ is in $\mathbb{Z}$ if $\varepsilon = +1$, and it says that $x$ is
in $\mathbb{Z}\sqrt{m}$ if $\varepsilon = -1$.

For (b), when $m = -1$ or $m = -3$, we have $D = -4$ or $D = -3$; thus $g = 0$,
and there is nothing to prove. For other values of $m < 0$, consider $J_S$. Then
$N(J_S) = \prod_{p \in S} p$, and this is some divisor $D'$ of $D$ with no repeated factors. Let us
write $J_S = (a, b + g\delta)$ by Problem 15. Then $ag = D'$ and $g$ divides $a$. Since $D'$ is
square free, $a = D'$ and $g = 1$. If $J_S$ is principal, then (a) shows that $J_S = (c)$ for an
integer $c$ or $J_S = (d\sqrt{m})$ for an integer $d$.

Suppose $J_S = (c)$. Then $b + \delta = rc$ for some $r \in R$. Write $r = x + y\delta$ for
integers $x$ and $y$. Then $b + \delta = cx + cy\delta$ shows that $1 = cy$ and hence that $c$ divides
1. Thus $J_S = R$, and the set $S$ is empty.

Suppose $J_S = (d\sqrt{m})$. Then $b + \delta = dx\sqrt{m} + dy\delta\sqrt{m}$ for some integers $x$ and $y$.
If $D$ is odd, then the equation reads $b + \frac12(1 - \sqrt{m}) = dx\sqrt{m} + dy\,\frac12(1 - \sqrt{m})\sqrt{m}$.
This implies that $-\frac12\sqrt{m} = d(x + \frac12 y)\sqrt{m}$, hence that $-1 = d(2x + y)$. Therefore
$d = \pm 1$, $J_S = (\sqrt{m}) = (\sqrt{D})$, $N(J_S) = |D|$, and $S = E$. If $D$ is even, then the
equation reads $b - \sqrt{m} = dx\sqrt{m} - dym$, and we obtain $-1 = dx$. So $d = \pm 1$,
$J_S = (\sqrt{m})$, $N(J_S) = |m| = |D|/4 = D'$. This is the product of all prime divisors of
$D$ if $D/4 \equiv 2 \bmod 4$ and all of them but 2 if $D/4 \equiv 3 \bmod 4$.

For (c), let $E'$ be a subset of $g$ members of $E$, and assume that the element of $E$
that is not in $E'$ is not 2 unless $D = -4$. If $S$ and $S'$ are two subsets of $E'$, then
$J_SJ_{S'} = (n)J_T$, where $n = \prod_{p \in S \cap S'} p$ and $T = (S - S') \cup (S' - S)$. If $J_S$ and $J_{S'}$
represent the same genera, then $J_SJ_{S'}$ is principal, and $J_T$ must be principal. The set
$T$ can be empty only if $S = S'$, and it has to be a subset of $E'$ and thus cannot be
all of $E$. According to (b), the only way that $J_T$ can be principal is thus that $S = S'$
or that all of the conditions $D$ even, $D/4 \equiv 3 \bmod 4$, and $T = E' = E - \{2\}$ are
satisfied. In the latter case the construction of $E'$ shows that $D = -4$, $T$ is empty,
and $S = S'$. Thus the ideals $J_S$ for $S \subseteq E'$ represent distinct genera in every case.

For (d), the roots of unity are tek. Since N(€,) = —1, the roots of unity of norm 1 
are the ape, So suppose that ¢ = sr Put ¢9 = SH Then e90 (€9) = N(€o) = 


(-1)", and o(e[x) = o(eg)o(x) = a(en)ex = (—1)"e9 'ex a +(—1)"e7"e7"x = 
#£(-1)"efx = sefx with s = +(—1)". If s = +1, then e/x is in Z, while ifs = —1, 
then efx is in Z./m. Then the same steps as in (b) and (c) finish the argument. 

For (e), the four mentioned ideals are principal, and we have (1) = Js for S 
empty and (./m) = Js for S equal to the set of prime divisors of m. For these two 
ideals, N(1) > 0 and N(./m) < 0. Consider (yg ) and (yo ). The ideal (vg) has 
o((y9 )) = (6 (9 )) = (9 €1) = (9). and hence it is of the form (7) Js for some S. 
Then yg. = nr for somer e€ R, and it follows that nlyt is in R. This contradicts 
the minimality of IN(e I unless |n| = 1. Hence (yg) = Js for some S. Similarly 
(yo ) = Js for some S. Thus all four principal ideals are of the form Js. 

Let us see that the four principal ideals are distinct. Neither ideal (yg ) nor (yo ) 
can equal (1). In fact, if (vg ) were to equal (1), then a would be a unit ¢, and we 




2 


would have 1 = CORIO) - = o(e)e~! = e~, in contradiction to the fact that 


€, is fundamental. Similarly (yg ) cannot equal (1). 

Since o (yg /m)(yg Vm)! = -o(yg Vim (yg (Vim)! = -o (x OG)! 
= —é€1, the definition of yy shows that Yo. J/m = nyo for some integer n. Passing 
to norms gives —mN on )=n’?N (yp ). Therefore N (yg ) and N (yp ) have opposite 
sign. 

We have seen that two of the four elements 1, yg Yo »Vm have positive norm, 
two have negative norm, and the two of positive norm generate distinct principal 
ideals. To see that the two of negative norm generate distinct ideals, we consider 
separately the cases N(y) ) < 0 and N(yg) < 0. If N(yo ) < 0, we use the equation 
—mN (yg )=n’NnN (yp ) proved in the previous paragraph. If (yg) = (/m), then 
cancellation gives N (vg ) = +1; then Yo is a unit, and we have seen that it cannot 
be. If N (yg ) < 0, we use the definition of VG. in the same way as in the previous 
paragraph to obtain —mN (yo ) = n>N (vg ) for some integer n. Cancellation shows 
that N(yp ) = +1; then yg is a unit, and we have seen that it cannot be. Thus the 
four principal ideals are distinct. 

Now suppose that (x) is any principal ideal fixed by o. As in the statement of the 
problem, we have o(x) = ex for some unit ¢. The most general unit is of the form 


€ = +e}. We shall produce constructively the element of Problem 27 corresponding 
(n+1)/2 
1 


Docu ® 
toe. Put yon = ey if n is even and yoy = € 
have 


yo if n is odd. For n even we 


2 
O(yo.nxX) =O (yon)ex = +o (6! jet x = te 


—n/2 
1 / eix = =xy0,n*, 


and for n odd we have 


1)/2 — 1)/2 
(yonX) = o(Yonex = £0 (EPP yoyetx = +e," To (yo)e?x 


-1)/2 —1)/2 
be" v o(yo)x = +." Mt Worx = Lynx. 


Thus $\sigma(y_{0,n}x) = \pm y_{0,n}x$ for all $n$. Therefore $y_{0,n}x$ is in $\mathbb{Z}$ or in $\mathbb{Z}\sqrt{m}$, depending
on the sign $\pm$. Depending on the sign, $|N(y_{0,n}x)| = |N(y_{0,n})||N(x)|$ thus is either
the square of an integer or $m$ times the square of an integer. If $n$ is even, then
$|N(y_{0,n})| = 1$, and $|N(x)|$ is therefore either the square of an integer or $m$ times the
square of an integer. Since $|N(x)|$ is the value of the norm of $(x)$, there are only two
possible $S$'s for which this can happen. If $n$ is odd, then $|N(y_{0,n})| = a$ for a certain
square-free integer $a > 1$, as we have seen. Therefore $|N(x)|$ has to be either $a^{-1}$ times
the square of an integer or $ma^{-1}$ times the square of an integer. So there are only two
possible $S$'s in this case. Thus there are only four possible $S$'s in all cases, and these
have been accounted for. So the number of principal ideals among the $J_S$'s is exactly
four. To complete the proof, we now argue as in (c) but consider only possibilities
for which the product of two Js’s is n? times one of the two Js’s given by a principal 
ideal with a generator of positive norm. 




30. Since $D$ is fundamental, $(a_1, b_1, c_1)$ is automatically primitive. Then Lemma
1.10 produces a properly equivalent form that represents some integer $a$ relatively
prime to $D$. The rest follows from the argument in the second paragraph of the proof
of sufficiency in Theorem 1.6b.


31. For (a), choose an integer $r$ such that $b + 2ar = kD$ for some integer $k$; this is
possible because $\mathrm{GCD}(D, 2a) = 1$. Then the translation $x = x' + ry'$, $y = y'$ leads
from $ax^2 + bxy + cy^2$ to $ax'^2 + kDx'y' + c'y'^2$ for some $c'$. The discriminant of the new
form is still $D = k^2D^2 - 4ac'$, and thus $4ac' \equiv 0 \bmod D$. Since $\mathrm{GCD}(4a, D) = 1$,
$c' \equiv 0 \bmod D$.

For (b), $b$ has to be even because $D = b^2 - 4ac$ is even. Write $b = 2b'$. Choose an
integer $s$ such that $b' + as = kD$ for some $k$; this is possible because $\mathrm{GCD}(a, D) = 1$.
Then the translation $x = x' + sy'$, $y = y'$ leads from $ax^2 + bxy + cy^2$ to
$ax'^2 + 2kDx'y' + c'y'^2$ for some $c'$. The discriminant of the new form is $D =
4k^2D^2 - 4ac'$, where $c' = (4a)^{-1}D(4k^2D - 1) = a^{-1}(D/4)(4k^2D - 1)$. Modulo
$D$, this expression is $-\bar{a}(D/4)$, where $\bar{a}$ is an integer with $a\bar{a} \equiv 1 \bmod D$. Here
$a$ is odd, and hence $a^2 \equiv 1 \bmod 8$. If $2^n$ is the exact power of 2 dividing $D$,
then $\bar{a}a \equiv 1 \bmod 2^n$, and hence $\bar{a} \equiv a \bmod 2^n$. If $p$ is any odd prime dividing
$D$, then $p$ divides $D/4$, and hence $\bar{a}(D/4) \equiv 0 \equiv a(D/4) \bmod p$. Therefore
$\bar{a}(D/4) \equiv a(D/4) \bmod D$, and we conclude that $c' \equiv -a(D/4) \bmod D$.
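
The translation in part (a) can be tried numerically. The following is a minimal sketch, assuming Python 3.8 or later for the modular inverse; the function name `reduce_middle_coefficient` and the sample form $(3, 1, 2)$ of discriminant $-23$ are illustrative choices, not taken from the text.

```python
from math import gcd

def reduce_middle_coefficient(a, b, c):
    """Apply x = x' + r*y', y = y' so that the new middle coefficient is a multiple of D.

    Returns (D, r, (a, new_b, new_c)). Assumes GCD(D, 2a) = 1 as in part (a).
    """
    D = b * b - 4 * a * c
    n = abs(D)
    if gcd(n, 2 * a) != 1:
        raise ValueError("need GCD(D, 2a) = 1")
    # Solve b + 2*a*r = 0 (mod |D|) for r.
    r = (-b * pow(2 * a, -1, n)) % n
    # Expanding a(x'+ry')^2 + b(x'+ry')y' + c y'^2 gives these coefficients.
    new_b = b + 2 * a * r
    new_c = a * r * r + b * r + c
    return D, r, (a, new_b, new_c)

# Hypothetical sample form (3, 1, 2), whose discriminant is -23.
D, r, (a1, b1, c1) = reduce_middle_coefficient(3, 1, 2)
print(D, r, (a1, b1, c1))
```

For the sample form, both the new middle coefficient and the new last coefficient come out divisible by $D$, and the discriminant is unchanged, as the argument predicts.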

32. For (a), clearing fractions in the expression $ax^2 + kDxy + lDy^2 = r$ yields
$au^2 + kDuv + lDv^2 = rw^2$. Suppose a prime $p$ divides $\mathrm{GCD}(w, D)$. Then $p$ divides
$au^2$. Since $\mathrm{GCD}(a, D) = 1$, $p$ divides $u$. Referring back to the equation, we see that
$p^2$ divides $au^2$ and $kDuv$, hence divides $lDv^2$. Thus $p$ divides $lv^2$. The discriminant
is $D = k^2D^2 - 4alD$, and divisibility of $l$ by $p$ would force $p^2$ to divide the left side
$D$. Hence $p$ does not divide $l$, and $p$ must divide $v$. Then $p$ divides both $u$ and $v$,
in contradiction to the minimality of the common denominator $w$. We conclude that
$\mathrm{GCD}(w, D) = 1$. Taking the equation $au^2 + kDuv + lDv^2 = rw^2$ modulo $D$ gives
$au^2 \equiv rw^2 \bmod D$. Since $r$ and $w$ are relatively prime to $D$, so is $u$. Thus we can
rewrite this congruence as $a \equiv d^2r \bmod D$ for some integer $d$ relatively prime to $D$.

For (b), the same argument gives $a' \equiv d'^2r \bmod D$. Since $d$ is relatively prime to
$D$, we can rewrite the congruence for $a$ as $r \equiv d^{-2}a \bmod D$, and then $a' \equiv d'^2d^{-2}a \equiv
(d^{-1}d')^2a \bmod D$.

For (c), the given forms are properly equivalent over $\mathbb{Z}$ to $(a, kD, lD)$ and to
$(a', k'D, l'D)$, respectively, by Problem 31a. Proper equivalence over $\mathbb{Q}$ means that
the two forms take on the same rational values, one of which is the integer $a'$. Part
(b) therefore shows that $a' = as^2 + nD$ for some integers $s$ and $n$, necessarily with
$\mathrm{GCD}(s, D) = 1$. Modulo $D$, the forms are given by $ax^2$ and $a'x'^2$, and the first
can be transformed into the second by the substitution $x = sx'$, $y = s^{-1}y'$, where
$s^{-1}$ is the multiplicative inverse of $s$ in $\mathbb{Z}/D\mathbb{Z}$. In fact, substitution into $ax^2$ gives
$a(sx')^2 = (as^2)x'^2 \equiv a'x'^2 \bmod D$. This substitution is given by the matrix $\begin{pmatrix} s & 0 \\ 0 & s^{-1} \end{pmatrix}$
in $\mathrm{SL}(2, \mathbb{Z}/D\mathbb{Z})$.


33. Part (a) is almost the same as Problem 32a. Clearing fractions leads to
$au^2 + kDuv + (lD - a(D/4))v^2 = rw^2$, and the argument that no odd prime $p$
divides $\mathrm{GCD}(w, D)$ is the same. Suppose that 2 divides $w$. The equation modulo 4
is then $au^2 - a(D/4)v^2 \equiv 0 \bmod 4$ with $D/4$ congruent to 2 or 3 modulo 4. Since
2 divides $w$, at least one of $u$ and $v$ must be odd. If $D/4 \equiv 3 \bmod 4$, the congruence
becomes $a(u^2 + v^2) \equiv 0 \bmod 4$, which is impossible with at least one of $u$ and $v$
odd. If $D/4 \equiv 2 \bmod 4$, the congruence becomes $a(u^2 + 2v^2) \equiv 0 \bmod 4$, which
again is impossible with at least one of $u$ and $v$ odd. Thus $\mathrm{GCD}(w, D) = 1$. Taking
the equation modulo $D$ and using the invertibility of $r$ and $w$ modulo $D$, we have
$ar^{-1}w^{-2}(u^2 - (D/4)v^2) \equiv 1 \bmod D$.
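
The two impossibility claims modulo 4 in part (a) are finite checks. A minimal sketch follows, with the helper name `has_bad_solution` being a hypothetical choice; it uses that $a$ is odd because $\mathrm{GCD}(a, D) = 1$ with $D$ even.

```python
from itertools import product

def has_bad_solution(coeff):
    """True if a*(u^2 + coeff*v^2) = 0 (mod 4) for some odd a and some u, v not both even."""
    return any(
        (a * (u * u + coeff * v * v)) % 4 == 0
        for a, u, v in product(range(4), repeat=3)
        if a % 2 == 1 and (u % 2 == 1 or v % 2 == 1)
    )

# coeff = 1 models the case D/4 = 3 mod 4; coeff = 2 models the case D/4 = 2 mod 4.
print(has_bad_solution(1), has_bad_solution(2))
```

Only residues modulo 4 matter, so the search over `range(4)` covers every case.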

For (b), let $p$ be an odd prime divisor of $D$. The above congruence then becomes
$ar^{-1}w^{-2}u^2 \equiv 1 \bmod p$. Similarly with the second form, there is some $w'$ prime to
$D$ such that $a'r^{-1}w'^{-2}u'^2 \equiv 1 \bmod p$. Comparing the two expressions, we see that
$a$ modulo $p$ is the product of $a'$ and an invertible square.

For (c), the above congruence becomes $ar^{-1}w^{-2}(u^2 + v^2) \equiv 1 \bmod 4$. This forces
$u^2 + v^2 \equiv 1 \bmod 4$. Since $w$ has to be odd, $w^2 \equiv 1 \bmod 4$. Hence $ar^{-1} \equiv 1 \bmod 4$.
Similarly $a'r^{-1} \equiv 1 \bmod 4$, and therefore $a \equiv a' \bmod 4$.

For (d), the above congruence becomes $ar^{-1}(u^2 - (D/4)v^2) \equiv 1 \bmod 8$, since $w$
is odd. If $D/4 \equiv 2 \bmod 8$, we obtain $ar^{-1}(u^2 - 2v^2) \equiv 1 \bmod 8$. Here $u$ has to be
odd, and thus $ar^{-1}(1 - 2v^2) \equiv 1 \bmod 8$. If $v$ is even, this says that $a \equiv r \bmod 8$; if
$v$ is odd, it says that $a \equiv -r \bmod 8$. Putting this conclusion together with a similar
conclusion about the second form, we obtain $a' \equiv \pm a \bmod 8$.

If $D/4 \equiv 6 \bmod 8$, we obtain $ar^{-1}(u^2 + 2v^2) \equiv 1 \bmod 8$. Here $u$ has to be odd,
and thus $ar^{-1}(1 + 2v^2) \equiv 1 \bmod 8$. If $v$ is even, this says that $a \equiv r \bmod 8$; if $v$
is odd, it says that $a \equiv 3r \bmod 8$. Putting this conclusion together with a similar
conclusion about the second form, we obtain $a' \equiv a \bmod 8$ or $a' \equiv 3a \bmod 8$.

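For (e), we shall assemble a member of $\mathrm{SL}(2, \mathbb{Z}/D\mathbb{Z})$ one prime at a time and
use the Chinese Remainder Theorem. For odd primes $p$ dividing $D$, choose $s_p$ with
$a' \equiv s_p^2 a \bmod p$, and introduce the matrix $M_p = \begin{pmatrix} s_p & 0 \\ 0 & s_p^{-1} \end{pmatrix}$ in $\mathrm{SL}(2, \mathbb{Z}/p\mathbb{Z})$. If $D/4 \equiv$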


3 mod 4, introduce the matrix M> = ( 0) in SL, Z/4Z). If D/4 = 2 mod 4, let 


= G :) in SL(2, Z/8Z, if D/4 = 6 mod 8, and let My = G z) in SL(2, Z/8Z) 
if D/4 = 2 mod 8. The Chinese Remainder Theorem produces a unique matrix with 


entries in $\mathbb{Z}/D\mathbb{Z}$ that is congruent to $M_p$ modulo each odd prime divisor $p$ of $D$ and is
congruent to $M_2$ modulo the power of 2 dividing $D$. Call this matrix $M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$.
It has determinant 1 modulo $D$ and hence lies in $\mathrm{SL}(2, \mathbb{Z}/D\mathbb{Z})$. Then substitution of
$x = \alpha x' + \beta y'$ and $y = \gamma x' + \delta y'$ into the form $a(x^2 - (D/4)y^2)$ modulo $D$ leads
to the form $a'(x^2 - (D/4)y^2)$ modulo $D$.


34. These problems establish a function from the set of equivalence classes of 
binary quadratic forms over Z with discriminant D, the equivalence relation being 
proper equivalence over Q, onto the set of equivalence classes of binary quadratic 
forms over Z with discriminant D, the equivalence relation being proper equivalence 


over $\mathbb{Z}/D\mathbb{Z}$. The number of elements in the domain has to be $\geq$ the number of
elements in the range.


35. The steps in solving Problems 32 and 33 involve relating a to r modulo 
each prime power dividing D. These relationships are the same as the relationships 
between a and r’ if the form modulo D represents r’ and GCD(r’, D) = 1, and the 
relationships are transitive. Thus the genus characters take the same values at r as 
they do at r’, and they take the same values at a as well. 


36. Multiplication is the operation on proper equivalence classes of forms that 
corresponds to composition of aligned representatives of the classes, and composition 
is defined in such a way that the set of values of the composition is the set of products 
of a value of one form by a value of the other. The values are unaffected by proper 
equivalence over Z. 


37. For (a), D/4 has an odd number 2r + 1 of prime factors 4k + 3. Use of the 
Jacobi symbol with a odd and p varying over the prime divisors of D/4 gives 


N@=- 1 @T @=@" T ® 0 O®=s@o. 


poe: “pong Yo eagis pa4k-1 ” paak-+43 


Therefore 


ea T1(5) = (29) = QP) = @). 


For (b) and (c), say that the number of prime factors 4k + 3 of D/8 is t. With 
p varying over the odd prime divisors of D, the same computation as above gives 


I (2) = &(a)' (2). Then (?) = 2)(?) = n@é@' iH (2). One easily checks 
that ¢ is even if D/4 = 2 mod 8 and is odd if D/4 = 6 mod 8, and the result follows. 


38. For each odd prime divisor p of D, choose a residue rp modulo p such that 
(*) = Sp. If D is even, choose an odd residue rz modulo 8 such that a(r2) = 59. 
The Chinese Remainder Theorem produces an integer b prime to D such that b = 
rp mod p for the odd p’s and b = r2 mod 8. For this integer b and every k > 0, we 
have (PARP) = rp for each odd p and a(b + kD) = s2. Dirichlet’s Theorem says 
that b + kD is a prime q for a suitable choice of k, and this prime qg has the required 
properties. 


39. Problem 37 showed that the product of the genus characters for an odd integer
$a$ such that $\mathrm{GCD}(a, D) = 1$ is $\left(\frac{D}{a}\right)$. Using the genus characters at $a = q$, we see
that $\left(\frac{D}{q}\right) = 1$. Theorem 1.6b shows that $q$ is primitively representable by some form
$(q, b, c)$ of discriminant $D$. The values of the genus characters for this form are
their values on $q$, and we have arranged that these values are the various numbers
$s_p$. Since there are $g + 1$ genus characters and the first $g$ of them can be specified
arbitrarily and still give a similarity class modulo $D$, there are at least $2^g$ similarity
classes modulo $D$.




40. Problem 29 shows that the number of classes of type (i) is exactly $2^g$. Problems
30-33 show that equivalence of type (i) implies equivalence of type (ii), and they
therefore give a mapping of the set of classes of type (i) onto the set of classes of
type (ii). The definition of "similar modulo $D$" immediately implies that equivalence
of type (ii) implies equivalence of type (iii), and therefore we obtain a mapping of
the set of classes of type (ii) onto the set of classes of type (iii). Finally Problem 39
shows that there are at least $2^g$ classes of type (iii). The result follows.


Chapter II


1. The unital left $\mathbb{C}G$ modules correspond (via the universal mapping property of
a group algebra) to representations of $G$ on complex vector spaces. The theory in
Chapter VII of Basic Algebra shows that every representation splits as the direct sum
of irreducible representations, which correspond to simple left $\mathbb{C}G$ modules. Hence
every unital left $\mathbb{C}G$ module is semisimple. The left regular representation of $G$,
which corresponds to the left $\mathbb{C}G$ module $\mathbb{C}G$, decomposes as the sum of irreducible
representations, each irreducible representation occurring as many times as its degree.
The sum of all the irreducible subspaces of a given isomorphism type gives one of
the factors $M_n(\mathbb{C})$ of $\mathbb{C}G$, and every factor arises this way.

2. For (a), $\operatorname{rad} A = (\mathbb{C} + \mathbb{C}X)(X^2 + 1)$, and $S$ will be the sum of two copies
of $\mathbb{C}$. Finding $S$ requires some computation. We can identify $A/(\operatorname{rad} A)$ with the
quotient $\mathbb{C}[X]/(X^2 + 1)$, and direct computation shows that the two idempotents in
this notation having sum 1 are $\frac{1}{2i}(X + i)$ and $-\frac{1}{2i}(X - i)$. The proof of Proposition
2.23 shows how to lift these to idempotents in $A$. For the first one, put $a = \frac{1}{2i}(X + i)$
and $b = 1 - a = -\frac{1}{2i}(X - i)$, and observe that $(ab)^2 = 0$. The proposition
gives the formula $e = \sum_{k=0}^{2} \binom{4}{k} a^{4-k}b^k = a^4 + 4a^3b$, the term for $k = 2$ being 0.
Then $e = a^3(a + 4b) = \frac{1}{16}(X + i)^3(-3X + 5i)$. So one contribution to $S$ comes
from $\mathbb{C}e$; the other will come from the complex conjugate in the form of $\mathbb{C}f$, where
$f = \frac{1}{16}(X - i)^3(-3X - 5i)$.

We can check directly that $e$ is an idempotent. In fact,

$$e^2 - e = e\left[\tfrac{1}{16}(X + i)^3(-3X + 5i) - 1\right].$$

The polynomial in square brackets vanishes at $X = i$, and so does its derivative.
Thus the polynomial is divisible by $(X - i)^2$, and $e^2 - e = \frac{1}{16}(X + i)^3(-3X + 5i) \times
[(X - i)^2 Q(X)]$ is divisible by $(X^2 + 1)^2$.
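
The idempotent computation can also be confirmed by machine. The sketch below is illustrative only: it represents a polynomial as a list of complex coefficients (constant term first) and reduces modulo $(X^2+1)^2 = X^4 + 2X^2 + 1$. The helper names `poly_mul` and `reduce_mod` are assumptions of the sketch. All coefficients involved are dyadic rationals, so the floating-point arithmetic here is exact.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, constant term first."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def reduce_mod(p):
    """Reduce modulo (X^2+1)^2 using X^4 = -2X^2 - 1; returns 4 coefficients."""
    p = list(p)
    while len(p) > 4:
        top = p.pop()        # coefficient of X^d, where d = len(p) after the pop
        d = len(p)
        p[d - 2] -= 2 * top  # X^d = X^(d-4) * X^4 contributes -2 X^(d-2)
        p[d - 4] -= top      # ... and -X^(d-4)
    return p + [0j] * (4 - len(p))

x_plus_i = [1j, 1]
cube = poly_mul(poly_mul(x_plus_i, x_plus_i), x_plus_i)
# e = (1/16)(X+i)^3(-3X+5i), reduced modulo (X^2+1)^2
e = reduce_mod([c / 16 for c in poly_mul(cube, [5j, -3])])
f = [c.conjugate() for c in e]   # the complex-conjugate idempotent

e_squared = reduce_mod(poly_mul(e, e))
e_times_f = reduce_mod(poly_mul(e, f))
e_plus_f = [x + y for x, y in zip(e, f)]
print(e, e_squared == e, e_times_f, e_plus_f)
```

In this representation $e$ reduces to $\frac12 - \frac{3i}{4}X - \frac{i}{4}X^3$, and the run confirms that $e^2 = e$, $ef = 0$, and $e + f = 1$ in $A$.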

For (b), the answer is yes. This problem anticipates Problem 5 below. The algebra 
S is spanned linearly by its idempotents, and Problem 5 shows that the idempotents 
are determined uniquely in the commutative case. 

For (c), $\operatorname{rad} A = (\mathbb{R} + \mathbb{R}X)(X^2 + 1)$. Call the subalgebra $S_0$. This subalgebra will
be a 2-dimensional real subalgebra isomorphic to $\mathbb{C}$. To find it, we can go through
the proof of Theorem 2.17 or we can use the Galois group. The latter method is a
good bit easier. Thus we seek those members of $S$ as in (a) that are fixed by complex
conjugation. Since $S = \mathbb{C}e + \mathbb{C}\bar{e}$, the result is that $S_0 = \mathbb{R}(e + \bar{e}) + i\mathbb{R}(e - \bar{e})$. This
is unique; in fact, any choice of $S_0$ has the property that $S_0 \otimes_{\mathbb{R}} \mathbb{C}$ is an $S$ for (a), and
we know that the $S$ for (a) is unique.


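3. Since $\operatorname{rad} A$ is a nilpotent ideal of $A$, $(\operatorname{rad} A) \otimes_F B$ is a nilpotent ideal of
$A \otimes_F B$, and therefore $(\operatorname{rad} A) \otimes_F B \subseteq \operatorname{rad}(A \otimes_F B)$. For the reverse inclusion
Proposition 2.31 shows that $\operatorname{rad}(A \otimes_F B) = I \otimes_F B$ for some two-sided ideal $I$ of $A$.
If $(\operatorname{rad}(A \otimes_F B))^n = 0$ and $a_1, \dots, a_n$ are in $I$, then $(a_1 \otimes 1) \cdots (a_n \otimes 1)$ must be 0,
and hence $a_1 \cdots a_n = 0$. Therefore $I \subseteq \operatorname{rad} A$, and $\operatorname{rad}(A \otimes_F B) \subseteq (\operatorname{rad} A) \otimes_F B$.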

4. For (a), suppose on the contrary that there is an infinite sequence $M_1, M_2, \dots$
of distinct maximal ideals. Then we obtain a decreasing sequence of ideals $R \supseteq
M_1 \supseteq M_1M_2 \supseteq M_1M_2M_3 \supseteq \cdots$, and the Artinian property shows that $M_1 \cdots M_n =
M_1 \cdots M_nM_{n+1}$ for some $n$. Since $M_{n+1}$ is prime and $M_{n+1} \supseteq M_1 \cdots M_n$, $M_{n+1}$
contains $M_j$ for some $j$ with $1 \leq j \leq n$. By maximality, $M_{n+1} = M_j$, and we have a
contradiction.

In (b), every element of $\operatorname{rad} R$ is nilpotent because $\operatorname{rad} R$ is nilpotent. Conversely
if $x \in R$ is nilpotent with $x^n = 0$, then $Rx$ is nilpotent with $(Rx)^n = 0$, since
$a_1x\,a_2x \cdots a_nx = a_1a_2 \cdots a_nx^n = 0$ for any $a_1, \dots, a_n \in R$. Thus $Rx \subseteq \operatorname{rad} R$, and
the nilpotent element $x$ lies in $\operatorname{rad} R$. This proves (b), and (c) follows because $R$ is
semisimple if and only if $\operatorname{rad} R = 0$.

For (d), R semisimple implies that R is a product of full matrix rings over division 
rings. Commutativity implies that the matrices are all of size 1-by-1 and the division 
rings are all fields. 


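5. If $e'$ is a second representative, then $e' = e + r$ with $r \in \operatorname{rad} R$. If $n$ is an odd
integer large enough to have $r^n = 0$, then

$$0 = r^n = (e' - e)^n = \sum_{k=0}^{n} (-1)^k \binom{n}{k} e'^{\,n-k}e^k = e' + \sum_{k=1}^{n-1} (-1)^k \binom{n}{k} e'e - e$$

$$= e' + \Big(\sum_{k=0}^{n} (-1)^k \binom{n}{k}\Big) e'e - e'e + e'e - e = e' + 0 \cdot e'e - e'e + e'e - e = e' - e.$$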


6. Let $M_1, \dots, M_n$ be the finitely many maximal ideals, and put $N = M_1 \cdots M_n$.
Nakayama's Lemma says that if $I$ is any ideal contained in all maximal ideals, then
the only finitely generated unital $R$ module $M$ having the property that $IM = M$ is
$M = 0$. The Artinian property shows that $N^{k+1} = N^k$ for some $k$. We take $I = N$
and $M = N^k$ in Nakayama's Lemma. The $R$ module $M$ is finitely generated because
Artinian implies Noetherian (Theorem 2.15), and hence Nakayama's Lemma shows
that $N^k = 0$.


7. Let the maximal ideals be $M_1, \dots, M_n$, and let $(M_1 \cdots M_n)^k = 0$. If $P$ is a
prime ideal, then $P \supseteq 0 = (M_1 \cdots M_n)^k$. Since $P$ is prime, $P$ contains one of the
factors. Thus $P \supseteq M_j$ for some $j$.




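8. It helps to have a multiplication table available. If the rows index a factor on
the left and the columns index a factor on the right, with rows and columns both in
the order $R$, $M$, $S$, then the resulting products are given by

$$\begin{pmatrix} R & M & 0 \\ 0 & 0 & M \\ 0 & 0 & S \end{pmatrix}.$$

If $I_2$ is a left ideal of $S$ and $I_1$ is a left $R$ submodule of $R \oplus M$ containing $MI_2$,
then $RI_2 = 0$, $MI_2 \subseteq I_1$, and $SI_2 \subseteq I_2$. Also, $RI_1 \subseteq I_1$, $MI_1 = 0$, and $SI_1 = 0$.
Thus $AI_1 \subseteq I_1$ and $AI_2 \subseteq I_1 \oplus I_2$. Consequently $I_1 \oplus I_2$ is a left ideal of $A$.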


In the reverse direction if J is a left ideal in A, then J; = e ) JCR@OM 
and Ih = Ga) J C Sare such that J = [,) @h. Also,r € R implies ( 
=a hg C Ii, while (M@ @ S)I; = 0; and s € S implies G : 
yee C Ih, while Rly = 0 andm e€ M implies (? - Ga i 


(; 
(00) (03) ¥ 5 (oo) 4 =" 


9. For (a), suppose $A$ is left Noetherian. The table produced in the solution of
Problem 8 shows that $M \oplus S$ and $R \oplus M$ are two-sided ideals of $A$, and the respective
quotient rings are $R$ and $S$. As quotients of a left Noetherian ring, $R$ and $S$ have to be
left Noetherian. If $\{M_i\}$ is an ascending chain of $R$ submodules of $M$, then $\left\{\begin{pmatrix} 0 & M_i \\ 0 & 0 \end{pmatrix}\right\}$
is an ascending chain of left ideals of $A$, by Problem 8. The latter must be constant
from some point on, and then the same thing is true for $\{M_i\}$.

Conversely suppose that $R$ and $S$ are left Noetherian and that the left $R$ module $M$
satisfies the ascending chain condition. If $\{J_i\}$ is an ascending chain of left ideals of $A$,
then the corresponding sequence $\{(I_2)_i\}$ is an ascending chain of left ideals in $S$, and
$\{(I_1)_i\}$ is an ascending chain of left $R$ submodules of $R \oplus M$, each containing the
corresponding $M(I_2)_i$. Since
$S$ is left Noetherian, $\{(I_2)_i\}$ is constant from some point on. Since $R \cong (R \oplus M)/M$
and $M$ satisfy the ascending chain condition for their left $R$ submodules, so does
$R \oplus M$, and therefore $\{(I_1)_i\}$ is constant from some point on.


10. In view of Problem 9a, showing that $A$ is left Noetherian amounts to showing
that $R$ and $S$ are (left) Noetherian and $M$ satisfies the ascending chain condition for
its left $R$ submodules. The ring $S$ is Noetherian by assumption, and $R$ is a field,
hence is Noetherian. The action of $R$ on $M$ is the action of a field on itself, and the
$R$ submodules are trivial. In view of Problem 9b, $A$ fails to be right Noetherian if the
ascending chain condition fails for the right $S$ submodules of $M = R$. If the ascending
chain condition were to hold, then $R$ would be a finitely generated $S$ module, and
the only denominators needed for members of the full field $R$ of fractions would be
those dividing the product of the denominators of the generators; these fractions are
already in $S$, and hence $S$ would equal $R$, contradiction.

The analogs of the results of Problem 9 for the Artinian case show that $A$ fails to
be either left or right Artinian if $S$ is not Artinian. If $s$ is a nonunit in $S$, then the
chain of principal ideals $\{(s^k)\}$ is properly descending, since $(s^k) = (s^{k+1})$ implies
$\varepsilon s^k = s^{k+1}$ for some unit $\varepsilon$ and since the hypothesis that $S$ is an integral domain
allows us to cancel and obtain $\varepsilon = s$, contradiction.


11. Since $R$ and $S$ are fields, they are left and right Noetherian and Artinian. In
view of Problem 9, we are to show that $M = R$ satisfies both chain conditions for
its left $R$ modules and neither chain condition for its right $S$ modules. Since $R$ is a
field, $M = R$ has only trivial $R$ submodules and satisfies both chain conditions. For
the $S$ action on $R$, we are to examine the $S$ vector subspaces of $R$. Since $\dim_S R$
is infinite, there exist both a properly increasing sequence of such subspaces and a
properly decreasing one. Hence neither chain condition is satisfied.


12. For (a), the vector-space dimension over $F$ is certainly 4, and computation
shows that $A$ is closed under products. The choices $a = 1$ and $b = 0$ show that $A$
has an identity.

For (b), let $x \neq 0$ be in a two-sided ideal $I$. If $x = \begin{pmatrix} a & 0 \\ 0 & \sigma(a) \end{pmatrix}$, then $x$ is invertible,
and hence $I = A$. Otherwise suppose that some matrix $x = \begin{pmatrix} a & b \\ r\sigma(b) & \sigma(a) \end{pmatrix}$ with $b \neq 0$
is in $I$. With $c$ as in the statement of the problem, $cx - xc = \begin{pmatrix} 0 & 2b\sqrt{m} \\ -2r\sigma(b)\sqrt{m} & 0 \end{pmatrix}$ is
in $I$; this matrix is invertible since $b \neq 0$, and thus $I = A$.

To see that $A$ is central, let $x$ be in the center. The computation $0 = cx - xc$ shows
that $b = 0$. Thus $x$ is of the form $\begin{pmatrix} a & 0 \\ 0 & \sigma(a) \end{pmatrix}$. Such an $x$ does not commute with $\begin{pmatrix} 0 & 1 \\ r & 0 \end{pmatrix}$
unless $a = \sigma(a)$, in which case $x$ is in $F$.


13. The determinant is $a\sigma(a) - rb\sigma(b) = N_{K/F}(a) - rN_{K/F}(b)$ and equals 0
for a given $r$ if and only if some pair $(a, b) \neq (0, 0)$ has $N_{K/F}(a) = rN_{K/F}(b)$.
Since $r \neq 0$, both $a$ and $b$ are nonzero, and this equality then holds if and only if
$r = N_{K/F}(ab^{-1})$.

In other words, some nonzero member of $A$ has determinant 0 if $r$ is a norm, and
then $A$ cannot be a division algebra. Conversely if $r$ is not a norm, then every nonzero
member of $A$ is invertible as a matrix. Computation of the inverse matrix shows that
it has the correct form to be in $A$. Hence $A$ is a division algebra.

When $A$ is not a division algebra, it is anyway finite-dimensional and central simple
and has to be of the form $M_n(D)$ for some $n$ and some division algebra $D$ over $F$
such that $\dim_F M_n(D) = 4$. The dimensional formula says that $n^2 \dim_F D = 4$. Since
$n \neq 1$, we must have $n = 2$ and $D = F$.


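14. The isomorphism follows from the computation

$$\begin{pmatrix} c & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ r\sigma(b) & \sigma(a) \end{pmatrix} \begin{pmatrix} c^{-1} & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & bc \\ rc^{-1}\sigma(b) & \sigma(a) \end{pmatrix} = \begin{pmatrix} a & bc \\ r'\sigma(c)\sigma(b) & \sigma(a) \end{pmatrix} = \begin{pmatrix} a & bc \\ r'\sigma(bc) & \sigma(a) \end{pmatrix}.$$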


15. Direct computation. 


16. If $K$ is a maximal subfield, then $\dim_F K = 2$. Since the characteristic is not 2,
$K = F(\sqrt{m})$ for some nonsquare $m \in F$. Define $i \in K$ to be $\sqrt{m}$.

The map $f : K \to D$ given by $f(a + bi) = a - bi$ is an algebra homomorphism
into the central simple algebra $D$. So the Skolem-Noether Theorem produces $j \in D$
with $j(a + bi)j^{-1} = a - bi$ for all $a + bi$ in $K$, necessarily with $j$ invertible.
As in the proof of Theorem 2.50, $j^2 = r$ lies in $F$. Define $k = ij$. Then $k^2 =
ijij = i(jij^{-1})j^2 = i(-i)j^2 = -rm$, and $-rm = k^2 = ijk$ implies that $k =
-rm(j^{-1})(i^{-1}) = -rm(r^{-1}j)(m^{-1}i) = -ji$.

Let us check the multiplication table for $\{1, i, j, k\}$. We know that $i^2 = m$, $j^2 = r$,
$k^2 = -rm$, $ij = k$, and $ji = -k$. In addition, we have

$$jk = jij = (jij^{-1})j^2 = (-i)r = -ri,$$
$$kj = ijj = i(j^2) = ri,$$
$$ki = iji = i(jij^{-1})j = i(-i)j = -mj,$$
$$ik = iij = (i^2)j = mj.$$

Hence the $F$ linear map $\varphi$ from $A$ into the given central simple algebra is an algebra
homomorphism sending 1 into 1. Since $A$ is simple, $\varphi$ is one-one. Since $A$ and the
given algebra both have dimension 4, $\varphi$ is onto. Thus $\varphi$ is an algebra isomorphism.
(We did not have to check directly that $\{1, i, j, k\}$ is linearly independent over $F$.)
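
The multiplication table can be sanity-checked in a concrete matrix model. The sketch below is an illustration under stated assumptions: it takes $F = \mathbb{Q}$ with the hypothetical sample values $m = 2$ and $r = 3$, stores an element $p + q\sqrt{m}$ of $K$ as the integer pair `(p, q)`, and uses the matrices $i = \begin{pmatrix} \sqrt{m} & 0 \\ 0 & -\sqrt{m} \end{pmatrix}$ and $j = \begin{pmatrix} 0 & 1 \\ r & 0 \end{pmatrix}$, which have the shape $\begin{pmatrix} a & b \\ r\sigma(b) & \sigma(a) \end{pmatrix}$ appearing in Problems 12-14. The helper names are not from the text.

```python
M, R = 2, 3  # hypothetical sample values for m and r

def kmul(x, y):
    """(p1 + q1*sqrt(m)) * (p2 + q2*sqrt(m)) as a pair."""
    return (x[0] * y[0] + M * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def kadd(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mat_mul(A, B):
    """Product of 2-by-2 matrices with entries in K."""
    return tuple(
        tuple(kadd(kmul(A[row][0], B[0][col]), kmul(A[row][1], B[1][col]))
              for col in range(2))
        for row in range(2)
    )

def scale(t, A):
    """Multiply a matrix by the integer t."""
    return tuple(tuple((t * p, t * q) for (p, q) in row) for row in A)

ZERO, ONE, SQRT_M = (0, 0), (1, 0), (0, 1)
IDENTITY = ((ONE, ZERO), (ZERO, ONE))
i = ((SQRT_M, ZERO), (ZERO, (0, -1)))
j = ((ZERO, ONE), ((R, 0), ZERO))
k = mat_mul(i, j)

table_ok = (
    mat_mul(i, i) == scale(M, IDENTITY)
    and mat_mul(j, j) == scale(R, IDENTITY)
    and mat_mul(k, k) == scale(-R * M, IDENTITY)
    and mat_mul(j, i) == scale(-1, k)
    and mat_mul(j, k) == scale(-R, i)
    and mat_mul(k, j) == scale(R, i)
    and mat_mul(k, i) == scale(-M, j)
    and mat_mul(i, k) == scale(M, j)
)
print(table_ok)
```

The check covers only one choice of $m$ and $r$, so it illustrates the relations rather than proving them.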


17. $A$ is an algebra by routinely checking that it is closed under multiplication.
Manifestly $A$ has an identity and has dimension 9 over $F$. If $I$ is a nonzero two-sided
ideal in $A$, let $x = a + bj + cj^2$ be nonzero in $I$, and assume that $x$ is chosen in $I$ such
that as few of the coefficients $a, b, c$ are nonzero as possible. Possibly by multiplying
$x$ by $j$ or $j^2$ on the right, we may assume that $a \neq 0$. Choose $d \in K$ with $d$, $\sigma(d)$, and
$\sigma^2(d)$ distinct. Computation shows that $dx - xd$ has one fewer nonzero coefficient.
By minimality we must have $dx - xd = 0$; hence $x$ must have had just one nonzero
coefficient. Such an $x$ is invertible, and thus 1 is in $I$ and $I = A$. Hence $A$ is simple.
To see that $A$ has just $F$ as center, we test a general element $x = a + bj + cj^2$ for
commutativity with both $d \in K$ and the element $j$, and we find that $b = c = 0$ and
$a = \sigma(a) = \sigma^2(a)$.

18. Since $A$ is finite-dimensional central simple, $A \cong M_n(D)$ for some $n$ and
some central division algebra $D$ over $F$. Then $9 = \dim_F A = n^2 \dim_F D$, and the only
possibilities are that $n = 3$ and $D = F$, or that $n = 1$. In the first case, $A \cong M_3(F)$,
and in the second case, $A$ is a division algebra. In the first case any column of $A$
(when viewed as $M_3(F)$) is a 3-dimensional left $A$ module; in the second case $A$ has
no proper nonzero left $A$ modules.


19. Left multiplication by $K$ makes $A$ into a $K$ vector space, and the left $K$
submodules of $A$ are the $K$ vector subspaces. The $F$ dimension of such a subspace
is 3 times the $K$ dimension. Hence the left $K$ submodules of $A$ of $F$ dimension 3 are
the subspaces of $K$ dimension 1, which consist of all left $K$ multiples of any nonzero vector.

Let $x = a_0 + b_0j + c_0j^2$ be nonzero in $A$. Then $Kx$ is a left $A$ module if and only if
$jx$ lies in $Kx$. Here $jx = \sigma(a_0)j + \sigma(b_0)j^2 + \sigma(c_0)j^3 = r\sigma(c_0) + \sigma(a_0)j + \sigma(b_0)j^2$.
This equals $dx$ for some $d \in K$ if and only if

$$r\sigma(c_0) = da_0, \qquad \sigma(a_0) = db_0, \qquad \text{and} \qquad \sigma(b_0) = dc_0. \tag{$*$}$$

Combining the second and third equations gives the necessary condition that $\sigma^2(a_0) =
\sigma(db_0) = \sigma(d)\sigma(b_0) = \sigma(d)dc_0$. Applying $\sigma$ gives the necessary condition $a_0 =
\sigma^3(a_0) = \sigma(\sigma(d)dc_0) = \sigma^2(d)\sigma(d)\sigma(c_0) = \sigma^2(d)\sigma(d)r^{-1}da_0 = N_{K/F}(d)r^{-1}a_0$.
Thus it is necessary that some $d \in K$ have $N_{K/F}(d) = r$. Conversely if $d \in K$ has
$N_{K/F}(d) = r$, then $x_0 = 1 + d^{-1}j + d^{-1}\sigma(d)^{-1}j^2$ has $a_0 = 1$, $b_0 = d^{-1}$, and
$c_0 = d^{-1}\sigma(d)^{-1}$, and we observe that the conditions $(*)$ are satisfied; thus $Kx_0$ is a
left $A$ submodule.


Chapter III


1. For (a), define f : A x K — Endg A by f(a,c)(a’) = aa'c just as in the 
proof of Theorem 3.3. The verification that the action of right multiplication by b € B 
commutes with f(a, c), ie., that f(a, c) is in Endgo A, uses that B commutes with 
K, and the verification that the extended map f : A @r K — Endg A respects 
multiplication uses that K is commutative; otherwise the argument is the same as 
with Theorem 3.3. The algebra A @  K is central simple over K, and B is an algebra 
over K because B contains K. Since A @F K is simple, f is one-one. 

For (b), let V be the unique-up-to-isomorphism simple finite-dimensional left B 
module. If the left B module B is the direct sum of m copies of V, then the proof 
of Theorem 2.2 shows that B° = Endg B = M,,(D°), where D® is the central 
division algebra over K given by D° = Endg V. Hence B = M,,(D). If V° 
denotes the unique-up-to-isomorphism simple finite-dimensional left B° module and 
if D’? = Endgo(V2), then we have B = Endgo(B°) = M,,(D’°), and it follows that 
m =m and D’ = D°. 

Since B C A, A is a right B module, hence a left B° module, and A has to 
be the direct sum of some number n of copies of V°. Then the same argument 
gives an isomorphism Endgo A = M,(D'°) = M,(D). The Double Centralizer 
Theorem gives dimr A = (dimr B)(dimy K), and thus dimx A = dimr B = 
(dimpr K)(dimx B) = (dimr K)(mdimx V). Meanwhile, dimx A = ndimx V 
and thus ndimk V = (dimr K)(mdimgx V). Son = mdimp K. Consequently 
dime Endgo A = n? dimpr D = m?(dimr D)(dimp K)* = (dimp B)(dimp K)* = 
(dimp A)(dimp K) = dimr(A @F K), and the map f in (a) is onto. 

For (c), application of (b) and an isomorphism from above gives A @r K = 
Endge(A) = M,,(D), and we have seen that B = M,,(D). Thus A @F K and B lie 
in the same Brauer equivalence class in B(K). 

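2. Take the product over $\sigma$ of the equality $\rho(a(\sigma, \tau))a(\rho, \sigma\tau) = a(\rho, \sigma)a(\rho\sigma, \tau)$,
and get $\rho\big(\prod_\sigma a(\sigma, \tau)\big) \prod_\sigma a(\rho, \sigma\tau) = \prod_\sigma a(\rho, \sigma) \prod_\sigma a(\rho\sigma, \tau)$. Since $\sigma\tau$ and $\rho\sigma$
each run through the Galois group as $\sigma$ does, canceling gives
$\rho\big(\prod_\sigma a(\sigma, \tau)\big) = \prod_\sigma a(\sigma, \tau)$. Thus $\prod_\sigma a(\sigma, \tau)$ is fixed by every member of the
Galois group and is in $F^\times$.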

3. Proposition 3.32 and Theorem 3.31 show that $H^{2k}(\mathrm{Gal}(K/F), K^\times) \cong
H^2(\mathrm{Gal}(K/F), K^\times)$ for $k \geq 1$ and $H^{2k+1}(\mathrm{Gal}(K/F), K^\times) \cong H^1(\mathrm{Gal}(K/F), K^\times)$
for $k \geq 0$. Then Corollary 3.34 gives $H^{2k} \cong F^\times/N_{K/F}(K^\times)$ for all $k \geq 1$, and
Theorem 3.17 gives $H^{2k+1} = 0$ for all $k \geq 0$. Finally $H^0$ is the subgroup of elements
in $K^\times$ fixed by $\mathrm{Gal}(K/F)$, and this is $F^\times$.


4. For (a), it is shown in Chapter IX of Basic Algebra that $\mathbb{Q}(e^{2\pi i/p})$ is a Galois
extension of $\mathbb{Q}$ with cyclic Galois group of order $p - 1$ whenever $p$ is prime. Here
$p = 7$. Complex conjugation is a member of the Galois group of order 2, and $K$ is the
subfield fixed by this subgroup. Hence $K$ has degree $6/2 = 3$ over $\mathbb{Q}$, and its Galois
group is the quotient of a cyclic group of order 6 by the subgroup of order 2, hence is
cyclic of order 3. The powers $\zeta, \zeta^2, \dots, \zeta^6$ form a basis of the $\mathbb{Q}$ vector space $\mathbb{Q}(\zeta)$, and
the sums of them with their images under complex conjugation span $K$. These sums
are $\tau_1, \tau_2, \tau_3$. Since there are only 3 such sums, they must be linearly independent
over $\mathbb{Q}$. Put $\tau_k = \zeta^k + \zeta^{-k}$. Then $\tau_k$ depends only on $k \bmod 7$, and $\tau_k = \tau_{-k}$.
Hence the only $\tau_k$'s that are not any of $\tau_1, \tau_2, \tau_3$ are the ones with $k \equiv 0 \bmod 7$. The
members of the Galois group of $\mathbb{Q}(\zeta)$ carry $\zeta$ to $\zeta^k$ for $1 \leq k \leq 6$ and therefore carry
$\tau_1$ to $\tau_k$, $\tau_2$ to $\tau_{2k}$, and $\tau_3$ to $\tau_{3k}$. None of $k$, $2k$, $3k$ is divisible by 7, and the result
follows.

For (b), let $\sigma \in \mathrm{Gal}(K/\mathbb{Q})$ have $\sigma(\tau_1) = \tau_2$, $\sigma(\tau_2) = \tau_3$, and $\sigma(\tau_3) = \tau_1$. For
$x \in K$, we have $N_{K/\mathbb{Q}}(x) = x\sigma(x)\sigma^2(x)$. With $x = a\tau_1 + b\tau_2 + c\tau_3$, we get 27 terms
when everything is expanded out, and they are the ones listed.

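For (c), $\tau_1 + \tau_2 + \tau_3 = -1$ because $\sum_{k=0}^{6} \zeta^k = 0$. Next, $\tau_1\tau_2 = (\zeta + \zeta^{-1})
(\zeta^2 + \zeta^{-2}) = \zeta^3 + \zeta^{-1} + \zeta + \zeta^{-3} = \tau_1 + \tau_3$, and the other two identities on the
second line are similar. Finally $\tau_1^2 = (\zeta + \zeta^{-1})^2 = \zeta^2 + 2 + \zeta^{-2} = \tau_2 + 2$, and the
other two identities are similar.

For (d), let $\alpha, \beta, \gamma, \delta$ be the expressions involving $\tau_1, \tau_2, \tau_3$ on the right side in
(b). First we have $\tau_1^3 = \tau_1^2\tau_1 = (\tau_2 + 2)\tau_1 = \tau_1\tau_2 + 2\tau_1 = 3\tau_1 + \tau_3$. Summing this
expression and similar expressions for $\tau_2^3$ and $\tau_3^3$ gives $\alpha = 4(\tau_1 + \tau_2 + \tau_3) = -4$.
Second $\beta = \tau_1\tau_2\tau_3 = (\tau_1 + \tau_3)\tau_3 = \tau_2 + \tau_3 + \tau_1 + 2 = 1$. In (d), the coefficient
of $abc$ is $\alpha + 3\beta = -4 + 3 = -1$, and the coefficient of $a^3 + b^3 + c^3$ is $\beta = 1$.
Third $\tau_1^2\tau_2 = \tau_1(\tau_1 + \tau_3) = (\tau_2 + 2) + (\tau_2 + \tau_3) = \tau_3 + 2\tau_2 + 2$. Similarly
$\tau_2^2\tau_3 = \tau_1 + 2\tau_3 + 2$ and $\tau_3^2\tau_1 = \tau_2 + 2\tau_1 + 2$. The sum is $\gamma = 3(\tau_1 + \tau_2 + \tau_3) + 6 = 3$.
Fourth $\tau_1\tau_2^2 = \tau_1(\tau_3 + 2) = \tau_2 + \tau_3 + 2\tau_1$. Similarly $\tau_2\tau_3^2 = \tau_1 + \tau_3 + 2\tau_2$ and
$\tau_3\tau_1^2 = \tau_1 + \tau_2 + 2\tau_3$. The sum is $\delta = 4(\tau_1 + \tau_2 + \tau_3) = -4$.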

For (e), the norm modulo 3 is $(a^3 + b^3 + c^3) - abc - (a^2c + ab^2 + bc^2)$, and this is
$\equiv (a + b + c) - abc - (a^2c + ab^2 + bc^2) \bmod 3$. Any nonzero square is $\equiv 1 \bmod 3$,
and we consider cases. If 3 does not divide $abc$, then $a^2 \equiv b^2 \equiv c^2 \equiv 1 \bmod 3$, and
the norm is $\equiv -abc \not\equiv 0 \bmod 3$. If 3 divides $a$ but not $bc$, then $b^2 \equiv c^2 \equiv 1 \bmod 3$,
and the norm is $\equiv (b + c) - b = c \not\equiv 0 \bmod 3$. If 3 divides $a$ and $b$ but not $c$, then
the norm is $\equiv c \not\equiv 0 \bmod 3$, while if 3 divides $a$ and $c$ but not $b$, then the norm is
$\equiv b \not\equiv 0 \bmod 3$. The case that 3 divides all of $a, b, c$ is excluded by the condition that
$\mathrm{GCD}(a, b, c) = 1$, and all other cases are handled by symmetry. Thus in all cases
the norm is not divisible by 3.
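
Parts (d) and (e) lend themselves to a quick numerical cross-check. In the sketch below the explicit norm form is assembled from the coefficients $-1$, $1$, $3$, $-4$ found in (d), with the placement of the monomials inferred from the reduction in (e); that assembly is an assumption of the sketch, since the formula in the problem statement is not reproduced here. The numeric norm uses $\tau_k = 2\cos(2\pi k/7)$ and $\sigma(\tau_1) = \tau_2$, $\sigma(\tau_2) = \tau_3$, $\sigma(\tau_3) = \tau_1$ as in (b).

```python
from itertools import product
from math import cos, pi, isclose

def norm_form(a, b, c):
    """Norm of a*tau1 + b*tau2 + c*tau3, assembled from the coefficients in (d)."""
    return ((a**3 + b**3 + c**3) - a * b * c
            + 3 * (a * a * b + b * b * c + c * c * a)
            - 4 * (a * a * c + a * b * b + b * c * c))

def norm_numeric(a, b, c):
    """x * sigma(x) * sigma^2(x), computed in floating point."""
    t1, t2, t3 = (2 * cos(2 * pi * k / 7) for k in (1, 2, 3))
    x0 = a * t1 + b * t2 + c * t3
    x1 = a * t2 + b * t3 + c * t1   # sigma(x)
    x2 = a * t3 + b * t1 + c * t2   # sigma^2(x)
    return x0 * x1 * x2

# The assembled formula agrees with the numeric norm on a small grid.
formula_matches = all(
    isclose(norm_numeric(a, b, c), norm_form(a, b, c), abs_tol=1e-9)
    for a, b, c in product(range(-2, 3), repeat=3)
)

# Part (e): the norm is never divisible by 3 unless 3 divides all of a, b, c.
never_divisible_by_3 = all(
    norm_form(a, b, c) % 3 != 0
    for a, b, c in product(range(3), repeat=3)
    if (a, b, c) != (0, 0, 0)
)
print(formula_matches, never_divisible_by_3)
```

Since the norm form has integer coefficients, checking residues modulo 3 suffices for the divisibility claim.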

For (f), let $x, y, z$ be members of $\mathbb{Q}$ not all 0. Choose integers $a, b, c$ and relatively
prime integers $n$ and $d$ such that $x = nd^{-1}a$, $y = nd^{-1}b$, $z = nd^{-1}c$, and
$\mathrm{GCD}(a, b, c) = 1$. Then $N_{K/\mathbb{Q}}(x\tau_1 + y\tau_2 + z\tau_3) = n^3d^{-3}N_{K/\mathbb{Q}}(a\tau_1 + b\tau_2 + c\tau_3)$.
Applying (e) and supposing that 3 is a norm, we obtain $3 = d^{-3}n^3(3k + (1 \text{ or } 2))$
for some integer $k$. Thus $3d^3 = n^3(3k + (1 \text{ or } 2))$. This equality forces $n$ to divide
$d$, and we may therefore take $n = 1$. Thus $3d^3 = 3k + (1 \text{ or } 2)$. The left side is
divisible by 3, and the right side is not. Hence 3 is not a norm.

5. For (a), Dirichlet's Theorem (Theorem 1.21) says that there are infinitely many
primes of the form $p = kn + 1$. For any such $p$, $n$ divides $p - 1$. For (b) with this $p$,
the Galois group of $\mathbb{Q}(e^{2\pi i/p})/\mathbb{Q}$ is cyclic of order $p - 1$ and has a cyclic subgroup
of order $(p - 1)/n$. The corresponding subfield is a Galois extension of $\mathbb{Q}$ of degree
$n$ with cyclic Galois group.

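6. For $0 \leq k < n$ and $0 \leq l < n$, we have $x_{\sigma^k}x_{\sigma^l} = j^kj^l = j^{k+l}$. Meanwhile,
$x_{\sigma^{k+l}}$ equals $j^{k+l}$ if $k + l < n$ and equals $j^{k+l-n}$ if $k + l \geq n$. So $x_{\sigma^k}x_{\sigma^l} = x_{\sigma^{k+l}}$ if
$k + l < n$ and $x_{\sigma^k}x_{\sigma^l} = j^nx_{\sigma^{k+l-n}} = rx_{\sigma^{k+l-n}}$ if $k + l \geq n$. Thus $a(\sigma^k, \sigma^l)$ has the
stated value.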
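
As a consistency check, the function $a(\sigma^k, \sigma^l)$, equal to 1 for $k + l < n$ and to $r$ for $k + l \geq n$, can be tested against the cocycle identity $\rho(a(\sigma, \tau))a(\rho, \sigma\tau) = a(\rho, \sigma)a(\rho\sigma, \tau)$ quoted in Problem 2. The sketch below is illustrative: because $r$ lies in $F$, the Galois action on the values is trivial, so it suffices to compare exponents of $r$. The function names are hypothetical.

```python
from itertools import product

def exponent(k, l, n):
    """Exponent of r in a(sigma^k, sigma^l): 0 if k + l < n, and 1 otherwise."""
    return 0 if k + l < n else 1

def is_cocycle(n):
    """Check the 2-cocycle identity with rho = sigma^k, sigma = sigma^l, tau = sigma^m."""
    return all(
        exponent(l, m, n) + exponent(k, (l + m) % n, n)
        == exponent(k, l, n) + exponent((k + l) % n, m, n)
        for k, l, m in product(range(n), repeat=3)
    )

print([is_cocycle(n) for n in range(1, 8)])
```

Both sides of the exponent identity equal the integer part of $(k + l + m)/n$, which is why the check succeeds for every $n$.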

7. It is just a question of checking that $c_{\sigma^k}\sigma^k(c_{\sigma^l}) = a(\sigma^k, \sigma^l)c_{\sigma^{k+l}}$ with $a(\sigma^k, \sigma^l)$
as in the previous problem.

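8. We have $\partial_0(1, \sigma^k) = 1 - \sigma^k$ and thus

$$f_0\partial_0(1, \sigma^k) = 1 - \sigma^k = -(\sigma - 1)(1 + \sigma + \cdots + \sigma^{k-1}).$$

If we put $f_1(1, \sigma^k) = -(1 + \sigma + \cdots + \sigma^{k-1})$, then we have $Tf_1(1, \sigma^k) = f_0\partial_0(1, \sigma^k)$
for all $k$.

Next, for $k \le l$, we have $\partial_1(1, \sigma^k, \sigma^l) = (\sigma^k, \sigma^l) - (1, \sigma^l) + (1, \sigma^k) =
\sigma^k(1, \sigma^{l-k}) - (1, \sigma^l) + (1, \sigma^k)$. Then $f_1\partial_1(1, \sigma^k, \sigma^l)$ equals

$$-\sigma^k(1 + \sigma + \cdots + \sigma^{l-k-1}) + (1 + \sigma + \cdots + \sigma^{l-1}) - (1 + \sigma + \cdots + \sigma^{k-1}) = 0.$$

For $k > l$, the term $(\sigma^k, \sigma^l)$ is replaced by $\sigma^k(1, \sigma^{n+l-k})$. Thus $\partial_1(1, \sigma^k, \sigma^l) =
\sigma^k(1, \sigma^{n+l-k}) - (1, \sigma^l) + (1, \sigma^k)$. Then $f_1\partial_1(1, \sigma^k, \sigma^l)$ is

$$-\sigma^k(1 + \sigma + \cdots + \sigma^{n+l-k-1}) + (1 + \sigma + \cdots + \sigma^{l-1}) - (1 + \sigma + \cdots + \sigma^{k-1})$$
$$= -(1 + \sigma + \cdots + \sigma^{n+l-1}) + (1 + \sigma + \cdots + \sigma^{l-1})$$
$$= \sigma^l(-(1 + \sigma + \cdots + \sigma^{n-1})).$$

If we define $f_2$ as in the problem, then in the two cases we have

$k \le l$:  $Nf_2(1, \sigma^k, \sigma^l) = (1 + \sigma + \cdots + \sigma^{n-1})(0) = 0 = f_1\partial_1(1, \sigma^k, \sigma^l)$,

$k > l$:  $Nf_2(1, \sigma^k, \sigma^l) = (1 + \sigma + \cdots + \sigma^{n-1})(-\sigma^l) = f_1\partial_1(1, \sigma^k, \sigma^l)$.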

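9. To $\psi$ in $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, K^\times)$, the chain map of the previous problem associates
$\psi \circ f_2$ in $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G(\{(1, g_1, g_2)\}), K^\times)$, and then the corresponding member
of $C^2(G, K^\times)$ is $\Phi_2(\psi f_2)$ whose value at $(g_1, g_2)$ is $\psi f_2(1, g_1, g_1g_2)$. That is,
$\Phi_2(\psi f_2)(\sigma^k, \sigma^l) = \psi f_2(1, \sigma^k, \sigma^{k+l})$, and this by Problem 8 is $\psi(0)$ if $k + l < n$
and is $\psi(-\sigma^{k+l-n}) = \psi(\sigma^{k+l-n})^{-1}$ if $k + l \ge n$.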




10. Taking Proposition 3.32 into account, we see that the mapping whose kernel
gives the cocycles is $\operatorname{Hom}(T, 1) : \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, K^\times) \to \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, K^\times)$. Here
$\operatorname{Hom}(T, 1)\psi = \psi \circ T$. We are identifying $\psi$ with $\psi(1)$ and also $\psi \circ T$ with
$\psi(T(1)) = \psi(\sigma - 1) = (\sigma - 1)\psi(1)$ in additive notation. Hence the effect of
$\operatorname{Hom}(T, 1)$ is to carry $y$ to $\sigma(y)y^{-1}$ in multiplicative notation. A necessary and
sufficient condition for $\sigma(y)y^{-1}$ to be 1 is that $y$ be in $F^\times$, since the subgroup of $K^\times$
fixed by $G$ is $F^\times$.

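11. Since $\psi(0) = 1$ and $\psi(\sigma^{k+l-n}) = \sigma^{k+l-n}\psi(1) = \psi(1) = r^{-1}$, the member
$a$ of $C^2(G, K^\times)$ that corresponds to $\psi$ has

$$a(\sigma^k, \sigma^l) = \begin{cases} 1 & \text{if } k + l < n, \\ r & \text{if } k + l \ge n, \end{cases}$$

and this is the 2-cocycle of Problem 6.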
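
As a sanity check on the 2-cocycle of Problems 6 and 11, one can verify the cocycle identity $a(g,h)\,a(gh,m) = g(a(h,m))\,a(g,hm)$ by brute force. The Python sketch below is an illustration only, with invented names. It assumes that $r$ lies in $F^\times$, so that the Galois action on the values of $a$ is trivial and the identity reduces to an additive statement about the exponent of $r$.

```python
def exponent(k, l, n):
    """Exponent of r in a(sigma^k, sigma^l): 0 if k + l < n, otherwise 1."""
    return 0 if k + l < n else 1


def is_two_cocycle(n):
    """Check a(g,h) a(gh,m) = g(a(h,m)) a(g,hm) on the cyclic group of order n.

    Assumption: r is fixed by the Galois group, so g(a(h,m)) = a(h,m) and
    the identity becomes an equality of exponents of r.
    """
    for k in range(n):
        for l in range(n):
            for m in range(n):
                lhs = exponent(k, l, n) + exponent((k + l) % n, m, n)
                rhs = exponent(l, m, n) + exponent(k, (l + m) % n, n)
                if lhs != rhs:
                    return False
    return True
```

Both sides count how many times the sum $k + l + m$ wraps past $n$, which is why the identity holds for every $n$.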


12. Corollary 3.34 and Theorem 3.14 combine to give us a group isomorphism
$B(K/F) \cong F^\times/N_{K/F}(K^\times)$, and the above problems show that the element $r$ of $F^\times$
used in defining $A$ corresponds under this isomorphism to the coset of $r^{-1}$. Hence
the order of the Brauer equivalence class of $A$ equals the order of the coset of $r$, as
required.

If $A$ is not a division algebra, then $A \cong M_m(D)$ for some central division algebra
$D$ over $F$ and for some integer $m > 1$. Here $\dim_F D = (n/m)^2 < n^2$. Corollary
3.15 then gives the contradiction that the order of the Brauer equivalence class of $D$,
which is the same as the order of the class of $A$, divides $n/m$, which in turn is $< n$.


13. The Skolem-Noether Theorem shows that the image matrices under two
different isomorphisms $\varphi$ and $\psi$ have to be conjugate to one another, say with $\varphi =
C^{-1}\psi C$. Then

$$\det(\varphi(X1 - a \otimes 1)) = \det(C^{-1}\psi(X1 - a \otimes 1)C)$$
$$= (\det C)^{-1}\det(\psi(X1 - a \otimes 1))(\det C)$$
$$= \det(\psi(X1 - a \otimes 1)).$$


14. Let $B = A \otimes_F K$. The left $B$ module $B$ is semisimple and is the direct sum
of $n$ isomorphic simple modules of dimension $n$. On each the operation of $a \otimes 1$ has
characteristic polynomial $\det(X1 - a \otimes 1)$, and the characteristic polynomial for the
direct sum of the spaces is the product of the characteristic polynomials.


15. Arguing by contradiction, we may assume that the statement is false for
some monic $P = P(X)$ and that $P$ has the lowest possible degree among all monic
polynomials for which the assertion is false. Factor $P$ over $K$ into powers of distinct
irreducible polynomials as $P = P_1^{k_1}\cdots P_r^{k_r}$. The $n$-fold product of $P_1^{k_1}\cdots P_r^{k_r}$
with itself is in $F[X]$ by assumption and is therefore invariant under $\operatorname{Gal}(K/F)$.
Consequently for each $\sigma \in \operatorname{Gal}(K/F)$ and each $P_j$, there exists some $P_i$ such that
$P_i = \sigma(P_j)$. It follows that if $H$ is the subgroup of $G = \operatorname{Gal}(K/F)$ fixing $P_1$, then
$Q = \prod_{\sigma \in G/H} \sigma P_1$ is the product of distinct irreducible factors of $P$ and hence
divides $P$. The polynomial $Q$ is fixed by every member of $G$ and hence is monic
in $F[X]$. Thus $Q \ne P$. Then $Q^n$ is in $F[X]$, and hence $(P/Q)^n$ is in $F[X]$. The
fact that $P$ is not in $F[X]$ implies that $Q \ne P$. Therefore $\deg(P/Q) < \deg P$. By
the minimal choice of $\deg P$, $P/Q$ is in $F[X]$. Therefore $P = (P/Q)Q$ is in $F[X]$,
contradiction.

16. For a matrix $m$ with entries in a field, passing to a larger field does not change
$\det(X1 - m)$. Suppose we start with two finite Galois extensions $K_1$ and $K_2$ of $F$
that split $A$. Let $K_1$ be a splitting field for a polynomial $g_1 \in F[X]$, and let $K_2$ be
a splitting field for $g_2 \in F[X]$. Define $K$ to be a splitting field for $g_1g_2$. Then $K$ is
a finite Galois extension of $F$, and we can regard it as containing both $K_1$ and $K_2$.
Applying the first sentence of this paragraph first to $K_1$ and $K$ and then to $K_2$ and $K$,
we see that the reduced characteristic polynomial is the same over $K_1$ as it is over $K_2$.

17. The formulas for $\operatorname{Nrd}_{A/F}(ab)$ and $\operatorname{Nrd}_{A/F}(1)$ follow from properties of
determinants. From Problem 14 we observe that $\det a = (-1)^{n^2}\det(-a)$ and
$\det(-\varphi(a \otimes 1)) = (-1)^n\det(\varphi(a \otimes 1))$. Substituting $X = 0$ into the formula
therefore gives us $N_{A/F}(a) = \det a = (-1)^{n^2}\det(-a) = (-1)^{n^2}\det(-\varphi(a \otimes 1))^n =
(-1)^{n^2}((-1)^n)^n\det(\varphi(a \otimes 1))^n = \det(\varphi(a \otimes 1))^n = \operatorname{Nrd}_{A/F}(a)^n$. If $a$ is
invertible, then $1 = \operatorname{Nrd}_{A/F}(1) = \operatorname{Nrd}_{A/F}(aa^{-1}) = \operatorname{Nrd}_{A/F}(a)\operatorname{Nrd}_{A/F}(a^{-1})$ shows that
$\operatorname{Nrd}_{A/F}(a)$ is nonzero. Conversely if $\operatorname{Nrd}_{A/F}(a) \ne 0$, then $N_{A/F}(a) \ne 0$ and hence
$\det L(a) \ne 0$. If $P(X)$ is the characteristic polynomial of $L(a)$, then the Cayley-Hamilton
Theorem shows that $P(L(a)) = 0$. Since $\det L(a) \ne 0$, $P(X)$ has a nonzero constant
term. Therefore we can separate the constant term in the equation $P(L(a)) = 0$ to
exhibit an identity of the form $L(a)Q(L(a)) = 1$ for some polynomial $Q(X)$, and
the element $Q(a)$ is a 2-sided inverse to $a$ in $A$. This proves (a), and the conclusion
about division algebras is immediate.


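18. The definition gives

$$m(dx_\rho) = \sum_\mu \mu(d)a(\mu, \rho)E_{\mu,\mu\rho},$$
$$m(cx_\tau) = \sum_\sigma \sigma(c)a(\sigma, \tau)E_{\sigma,\sigma\tau},$$
$$m((dx_\rho)(cx_\tau)) = m(d\rho(c)a(\rho, \tau)x_{\rho\tau}) = \sum_\mu \mu(d\rho(c)a(\rho, \tau))a(\mu, \rho\tau)E_{\mu,\mu\rho\tau}.$$

Also we have

$$m(dx_\rho)m(cx_\tau) = \sum_{\mu,\sigma} \mu(d)a(\mu, \rho)\sigma(c)a(\sigma, \tau)E_{\mu,\mu\rho}E_{\sigma,\sigma\tau}
= \sum_\mu \mu(d)\mu\rho(c)a(\mu, \rho)a(\mu\rho, \tau)E_{\mu,\mu\rho\tau}.$$

This matches $m((dx_\rho)(cx_\tau))$ by the cocycle relation for $a$.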




For the reduced norm we have two one-one $F$ algebra homomorphisms of $A$ into
$M_n(K)$, one via the mapping $m$ above and one by the embedding $A \to A \otimes_F 1 \subseteq
A \otimes_F K \cong M_n(K)$, and these are conjugate by the Skolem-Noether Theorem. Hence
the determinant gives the same result in the two cases. The determinant in the second
case gives the reduced norm, and hence it must give the reduced norm in the first case.


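19. The algebra $\mathbb{H}$ can be realized as all complex matrices $x = \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}$, and
$\operatorname{Nrd}_{\mathbb{H}/\mathbb{R}}(x) = |\alpha|^2 + |\beta|^2$ and $N_{\mathbb{H}/\mathbb{R}}(x) = (|\alpha|^2 + |\beta|^2)^2$ as a special case of
Problem 18.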
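
The formula for the reduced norm in Problem 19 is just a $2 \times 2$ determinant. A minimal Python sketch follows, assuming the standard realization of a quaternion as the complex matrix with rows $(\alpha, \beta)$ and $(-\bar\beta, \bar\alpha)$; the function name is illustrative.

```python
def reduced_norm(alpha, beta):
    """Determinant of [[alpha, beta], [-conj(beta), conj(alpha)]].

    Assumption: this matrix is the standard complex realization of a
    quaternion; the determinant should equal |alpha|^2 + |beta|^2.
    """
    return alpha * alpha.conjugate() - beta * (-beta.conjugate())
```

For $\alpha = 1 + 2i$ and $\beta = 3 - i$ the determinant is $5 + 10 = 15$, in agreement with $|\alpha|^2 + |\beta|^2$.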

20. Let $D$ be a finite-dimensional central division algebra over $F$, say with
$\dim_F D = n^2$. Choose a basis $\{x_j\}$ of $D$ over $F$, and expand elements of $D$
as $x = \sum_{j=1}^{n^2} \xi_j x_j$. The function $P(\xi_1, \dots, \xi_{n^2}) = \operatorname{Nrd}_{D/F}\bigl(\sum_j \xi_j x_j\bigr)$ is easily
checked to be a homogeneous polynomial of degree $n$ in $n^2$ variables, and condition
(C1) says that it has a nontrivial zero if $n < n^2$. In this case the corresponding member
$x$ of $D$ would be a nonzero element of $D$ that fails to be invertible, and there is no such
element. We conclude that $n < n^2$ is false, and that means that $n = 1$. Therefore $F$
is the only finite-dimensional central division algebra over $F$, and $B(F) = 0$.


Chapter IV 


1. For (a), every free abelian group of finite rank is in the category, and such 
groups provide enough projectives. 

Let $I = F \oplus T$ be a decomposition of an injective $I$ as the direct sum of a free
abelian group $F$ of rank $k$ and a torsion group $T$. The sequence $0 \to F \oplus T \to
\tfrac{1}{2}F \oplus T \to (\mathbb{Z}/2\mathbb{Z})^k \to 0$ is exact but not split unless $k = 0$, and thus $F = 0$. Thus
every injective in the category is a finite group, and no infinite group in the category
embeds into an injective.

For (b), every abelian group and in particular every torsion abelian group is a 
subgroup of a divisible group. The torsion subgroup of the divisible group is still 
divisible and is still an injective, and thus every group in the category embeds in an 
injective in the category. 

Let $P$ be a projective in the category mapping onto $\mathbb{Z}/2\mathbb{Z} = \{0, 1\}$ by a
homomorphism $\tau$, and let $x$ be an element of $P$ with $\tau(x) = 1$. If $g$ is a generator of a
cyclic group $G$ of order $2^k$, then there is a homomorphism $\varphi$ of $G$ onto $\mathbb{Z}/2\mathbb{Z}$ with
$\varphi(g) = \tau(x) = 1$. Since $P$ is projective, there exists a homomorphism $\sigma : P \to G$
with $\varphi\sigma = \tau$, and then we have $1 = \tau(x) = \varphi\sigma(x)$. Then $\sigma(x) = g^m$ for some odd
integer $m$, and this has order $2^k$. Hence $x$ has order at least $2^k$. Since $k$ is arbitrary,
$x$ must have infinite order. But all groups in the category are torsion groups, and $P$
therefore cannot exist.


2. Let p be a prime, and let C be the category of all abelian groups that are the 
underlying additive group of a vector space over the field of p elements. This category 




coincides with the category of all direct sums of copies of Z/pZ. Every such abelian 
group is projective and injective for the category. 


3. Every unital left $R$ module is the direct sum of simple $R$ modules. Hence every
short exact sequence splits, and every module is both projective and injective for $\mathcal{C}_R$.


4. For (a), let $I$ be injective. Given $x \in I$ and $a \ne 0$ in $R$, let $B = C = R$, let
$\tau : R \to I$ have $\tau(r) = rx$, and let $\varphi : R \to R$ have $\varphi(r) = ra$. Setting up Figure
4.4, we obtain $\sigma : R \to I$ with $\tau = \sigma\varphi$. If we put $y = \sigma(1)$ and evaluate both sides
at 1, then we obtain $x = \tau(1) = \sigma(\varphi(1)) = \sigma(a) = a\sigma(1) = ay$, as required.

For (b), suppose that the unital left $R$ module $I$ is divisible. Suppose that $J$ is an
ideal of $R$, and write $J = (a)$. Let $\varphi : J \to I$ be an $R$ homomorphism. Since $I$ is
divisible, there exists $y$ in $I$ with $ay = \varphi(a)$. Then $\varphi$ extends to the $R$ homomorphism
$\Phi$ with $\Phi(1) = y$. By Proposition 4.15, $I$ is injective.


5. Proposition 4.20 shows that there exists an injective $I_0$ containing an isomorphic
copy $M'$ of $M$. Problem 4 shows that $I_0$ is divisible, and hence $I_1 = I_0/M'$ is divisible.
By Problem 4, $I_1$ is injective. Then $0 \to M \to I_0 \to I_1 \to 0$ is an injective resolution
of $M$.


6. If a module $M$ in $\mathcal{C}$ is given, we form the appropriate kind of resolution $X$ in $\mathcal{C}$
needed to compute the derived functors of $G$, and the same $X$ will be appropriate for
computing the derived functors of $F \circ G$. The derived functors of $G$ come from the
homology or cohomology of $G(X)$ with $G(M)$ removed, and the derived functors of
$F \circ G$ come similarly from $F(G(X))$. Thus the result follows from Proposition 4.4.


7. If a module $M$ in $\mathcal{C}$ is given, we form the appropriate kind of resolution $X$ in $\mathcal{C}$
needed to compute the derived functors of $G \circ F$ on $M$. Then $F(X)$ is the appropriate
kind of resolution for computing the derived functors of $G$ on $F(M)$, and the result
follows.


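8. For $n$ odd, $H^n(G, M)$ is the cohomology of the complex

$$\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M) \xleftarrow{\operatorname{Hom}(N,1)} \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M) \xleftarrow{\operatorname{Hom}(T,1)} \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M),$$

while for $n$ even, $H^n(G, M)$ is the cohomology of the complex

$$\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M) \xleftarrow{\operatorname{Hom}(T,1)} \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M) \xleftarrow{\operatorname{Hom}(N,1)} \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M).$$

This proves the isomorphisms concerning cohomology. For $n$ odd, $H_n(G, M)$ is the
homology of the complex

$$\mathbb{Z}G \otimes_{\mathbb{Z}G} M \xrightarrow{N \otimes 1} \mathbb{Z}G \otimes_{\mathbb{Z}G} M \xrightarrow{T \otimes 1} \mathbb{Z}G \otimes_{\mathbb{Z}G} M,$$

while for $n$ even, $H_n(G, M)$ is the homology of the complex

$$\mathbb{Z}G \otimes_{\mathbb{Z}G} M \xrightarrow{T \otimes 1} \mathbb{Z}G \otimes_{\mathbb{Z}G} M \xrightarrow{N \otimes 1} \mathbb{Z}G \otimes_{\mathbb{Z}G} M.$$

This proves the isomorphisms concerning homology.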




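9. For (a), let $\tau_{A,B} : \operatorname{Hom}_{\mathcal{D}}(F(A), B) \to \operatorname{Hom}_{\mathcal{C}}(A, G(B))$ be the natural
isomorphism. Naturality in $B$ says for any $\psi : B \to B'$ that we have

$$\operatorname{Hom}_{\mathcal{C}}(1_A, G(\psi)) \circ \tau_{A,B} = \tau_{A,B'} \circ \operatorname{Hom}_{\mathcal{D}}(1_{F(A)}, \psi)$$

on $\operatorname{Hom}_{\mathcal{D}}(F(A), B)$. Let $P$ be projective in $\mathcal{C}$. We are to prove that $F(P)$ is
projective in $\mathcal{D}$, thus to prove that $\operatorname{Hom}_{\mathcal{D}}(F(P), \,\cdot\,)$ is exact. We need to show that
whenever $\psi : B \to B'$ is onto in $\mathcal{D}$, then $\operatorname{Hom}_{\mathcal{D}}(1_{F(P)}, \psi)$ is onto. By hypothesis,
$G(\psi) : G(B) \to G(B')$ is onto in $\mathcal{C}$. The displayed equation with $A = P$ has
$\operatorname{Hom}_{\mathcal{C}}(1_P, G(\psi))$ onto, and $\tau_{P,B}$ and $\tau_{P,B'}$ are given as isomorphisms. Therefore
$\operatorname{Hom}_{\mathcal{D}}(1_{F(P)}, \psi)$ is onto, as we were to show. The proof of (b) is similar.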

10. Conclusion (a) follows from the natural isomorphism $\operatorname{Hom}_S(P_R^S A, B) =
\operatorname{Hom}_S(S \otimes_R A, B) \cong \operatorname{Hom}_R(A, F_S^R B)$. Conclusion (b) follows from Problem 9a
with $F = P_R^S$ and $G = F_S^R$, since $F_S^R$ is exact and therefore carries onto maps to onto
maps. For (c), $P_R^S A$ is given by the tensor product $S \otimes_R A$, and this tensor product
is an exact functor of $A$ if $S$ is projective as a right $R$ module, by Proposition 4.19a.

For (d), part (c) says that $M \mapsto P_R^S M$ is an exact functor. Taking it to be $F$ in
Problem 7a and $G$ to be $\operatorname{Hom}_S(\,\cdot\,, N)$, we have $\operatorname{Ext}_S^k(P_R^S M, N) = G^k(F(M))$.
Problem 7a says that this is equal to $(G \circ F)^k$. Since $(G \circ F)(M) = \operatorname{Hom}_S(P_R^S M, N) \cong
\operatorname{Hom}_R(M, F_S^R N)$ has $(G \circ F)^k(M) = \operatorname{Ext}_R^k(M, F_S^R N)$, we obtain $\operatorname{Ext}_S^k(P_R^S M, N) \cong
\operatorname{Ext}_R^k(M, F_S^R N)$.

For (e), (b) shows that the chain complex $P_R^S X$ is projective over $P_R^S M$, and we
are assuming that $Y$ is exact (and projective) over $P_R^S M$. Theorem 4.12 says that the
identity map on $P_R^S M$ extends to a chain map $f : P_R^S X \to Y$ that is unique up to
homotopy. Dropping the terms in degree $-1$ and applying the functor $\operatorname{Hom}_S(\,\cdot\,, N)$
to the diagram gives us a cochain map from the complex $\operatorname{Hom}_S(Y, N)$ to the complex
$\operatorname{Hom}_S(P_R^S X, N) \cong \operatorname{Hom}_R(X, F_S^R N)$. Thus we get homomorphisms on cohomology
$\operatorname{Ext}_S^k(P_R^S M, N) \to \operatorname{Ext}_R^k(M, F_S^R N)$.

11. Conclusion (a) follows from the natural isomorphisms $\operatorname{Hom}_S(A, I_R^S B) =
\operatorname{Hom}_S(A, \operatorname{Hom}_R(S, B)) \cong \operatorname{Hom}_R(S \otimes_S A, B) \cong \operatorname{Hom}_R(F_S^R A, B)$. Conclusion (b)
follows from Problem 9b because $F_S^R$ is exact and therefore carries one-one maps
to one-one maps. For (c), $I_R^S = \operatorname{Hom}_R(S, \,\cdot\,)$ is exact if $S$ is projective as a right $R$
module, by Proposition 4.19a.

For (d), part (c) says that $M \mapsto I_R^S M$ is an exact functor. Taking it to be $F$
in Problem 7b and $G$ to be $\operatorname{Hom}_S(M, \,\cdot\,)$, we have $\operatorname{Ext}_S^k(M, I_R^S N) = G^k(F(N))$.
Problem 7b says that this is equal to $(G \circ F)^k$. Since $(G \circ F)(N) = \operatorname{Hom}_S(M, I_R^S N) \cong
\operatorname{Hom}_R(F_S^R M, N)$ has $(G \circ F)^k(N) = \operatorname{Ext}_R^k(F_S^R M, N)$, we obtain $\operatorname{Ext}_S^k(M, I_R^S N) \cong
\operatorname{Ext}_R^k(F_S^R M, N)$.

For (e), (b) shows that the cochain complex $I_R^S X$ is injective over $I_R^S N$, and we
are assuming that $Y$ is exact (and injective) over $I_R^S N$. Theorem 4.16 says that the
identity map on $I_R^S N$ extends to a cochain map $f : Y \to I_R^S X$ that is unique up to
homotopy. Dropping the terms in degree $-1$ and applying the functor $\operatorname{Hom}_S(M, \,\cdot\,)$
to the diagram gives us a cochain map from the complex $\operatorname{Hom}_S(M, Y)$ to the complex
$\operatorname{Hom}_S(M, I_R^S X) \cong \operatorname{Hom}_R(F_S^R M, X)$. Thus we get homomorphisms on cohomology
$\operatorname{Ext}_S^k(M, I_R^S N) \to \operatorname{Ext}_R^k(F_S^R M, N)$.


12. For (a), the definition of $\Phi_q$ is

$$(\Phi_q\varphi)(g_1, \dots, g_q) = \varphi(1, g_1, g_1g_2, \dots, g_1\cdots g_q)$$

for $\varphi \in \operatorname{Hom}_{\mathbb{Z}G}(F_q, M)$. Putting $f = \Phi_q\varphi$ gives $\rho^* f = \rho^*(\Phi_q\varphi) =
\Phi_q(\varphi \circ \rho) = (\Phi_q\varphi) \circ \rho$, as asserted.

For inflation the groups are $(G, G') = (G, G/H)$, and the map $\rho$ is the quotient
map; the effect is given by $(\operatorname{Inf} f)(g_1, \dots, g_q) = f(g_1H, \dots, g_qH)$ for $f$ in
$C^q(G/H, M^H)$. For restriction the groups are $(G, G') = (H, G)$, and the map
is the inclusion; the effect is given by $(\operatorname{Res}\psi)(h_1, \dots, h_q) = \psi(h_1, \dots, h_q)$ for
$\psi \in C^q(G, M)$.

For (b), let $f$ be in $C^1(G/H, M^H)$. Then $\operatorname{Res}(\operatorname{Inf}(f))(h) = \operatorname{Inf}(f)(h) =
f(hH) = f(H)$. The condition for $f$ to be a cocycle is that $\delta_1 f = 0$, i.e., that
$f(uv) = f(u) + u(f(v))$ for $u$ and $v$ in $G/H$. Taking $u$ and $v$ to be the identity coset
$H$ shows that $f(H) = 0$.

For (c), let $f \in C^1(G/H, M^H)$ be a cocycle. Then $\operatorname{Inf}(f)(g) = f(gH)$. If
this is a coboundary in $C^1(G, M)$, then there exists $\psi \in M$ with $\delta_0\psi = \operatorname{Inf}(f)$, i.e.,
with $f(gH) = g\psi - \psi$ for all $g$. The left side depends only on the coset $gH$, and
hence so must the right side. Then it follows that $gh\psi = g\psi$ for all $h \in H$ and that
$\psi$ is in $M^H$. Then the formula $f(gH) = g\psi - \psi$ exhibits $f$ as a coboundary in
$C^1(G/H, M^H)$.

For (d), let $f$ be a cocycle in $C^1(G, M)$ such that $\operatorname{Res} f$ is a coboundary in
$C^1(H, M)$. The formula is $(\operatorname{Res} f)(h) = f(h)$, and the coboundary condition shows
that there is some $\psi \in M^H$ with $f(h) = h\psi - \psi$ for $h \in H$. Since $\psi$ is in $M^H$,
$f(h) = 0$ for all $h \in H$. The cocycle condition on $f$ is that $f(uv) = f(u) + u(f(v))$
for all $u$ and $v$ in $G$. Taking $v$ to be in $H$ shows that $f(gh) = f(g)$ for all $h \in H$.
Taking instead $u$ to be in $H$ shows that $f(hg) = h(f(g))$ for all $h \in H$. Since $H$ is
normal, $h(f(g)) = f(g)$ for all $h \in H$. Therefore $f$ takes values in $M^H$ and is $\operatorname{Inf}$
of the cocycle $\bar f$ in $C^1(G/H, M^H)$ given by $\bar f(gH) = f(g)$.


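13. For (a), we have $(g_0\varphi_m)(g) = \varphi_m(gg_0) = gg_0m = \varphi_{g_0m}(g)$, and $m \mapsto \varphi_m$ is
a $\mathbb{Z}G$ homomorphism. Suppose that $\varphi_m = 0$. Then $gm = 0$ for all $g$ and in particular
for $g = 1$. Therefore $m = 0$, and $m \mapsto \varphi_m$ is one-one. Then it follows that the
sequence is exact.

For (b), we know that $\mathbb{Z}G$ as an abelian group is free abelian. Then Problem 11d
shows that $H^k(G, B) = \operatorname{Ext}^k_{\mathbb{Z}G}(\mathbb{Z}, B) = \operatorname{Ext}^k_{\mathbb{Z}G}(\mathbb{Z}, I_{\mathbb{Z}}^{\mathbb{Z}G}(F_{\mathbb{Z}G}^{\mathbb{Z}} M)) \cong \operatorname{Ext}^k_{\mathbb{Z}}(\mathbb{Z}, F_{\mathbb{Z}G}^{\mathbb{Z}} M)$.
Since $\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}, \,\cdot\,)$ is exact from $\mathcal{C}_{\mathbb{Z}}$ to itself, $\operatorname{Ext}^k_{\mathbb{Z}}(\mathbb{Z}, F_{\mathbb{Z}G}^{\mathbb{Z}} M) = 0$ for $k \ge 1$.

For (c), a $\mathbb{Z}$ basis of $\mathbb{Z}G$ consists of all 1-tuples $(g)$ with $g \in G$, and a $\mathbb{Z}$ basis of
$\mathbb{Z}H$ consists of all $(h)$ with $h \in H$. Let $\{v\}$ be a set of representatives of the cosets of
$G/H$, and let $A$ be the free abelian group on $\{v\}$. The $\mathbb{Z}$-bilinear map $(v, (h)) \mapsto (vh)$
extends to a homomorphism of $A \otimes_{\mathbb{Z}} \mathbb{Z}H$ into $\mathbb{Z}G$ that is manifestly onto, and it is
one-one because $\sum_i n_i(v_ih_i) = 0$ implies $n_i = 0$ for all $i$. Thus it is an isomorphism.

For (d), use of (c) gives $F_{\mathbb{Z}G}^{\mathbb{Z}H} B = F_{\mathbb{Z}G}^{\mathbb{Z}H} \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}G, M) = \operatorname{Hom}_{\mathbb{Z}}(F_{\mathbb{Z}G}^{\mathbb{Z}H}(\mathbb{Z}G), M)
\cong \operatorname{Hom}_{\mathbb{Z}}(A \otimes_{\mathbb{Z}} \mathbb{Z}H, M) \cong \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}H, \operatorname{Hom}_{\mathbb{Z}}(A, M))$, and then $H^k(H, F_{\mathbb{Z}G}^{\mathbb{Z}H} B) = 0$
for $k \ge 1$ by the same argument as in (b).

For (e), the long exact sequence for $\operatorname{Ext}_{\mathbb{Z}H}(\mathbb{Z}, \,\cdot\,)$ that comes from the short exact
sequence in (a) shows that $0 \to H^0(H, M) \to H^0(H, B) \to H^0(H, N) \to
H^1(H, M)$ is exact. The right member is assumed to be 0, and the three middle
members are isomorphic to $M^H$, $B^H$, and $N^H$.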

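For (f), consider the $\mathbb{Z}$ bilinear map $(1, (g)) \mapsto (gH)$ of $\mathbb{Z} \times \mathbb{Z}G$ into $\mathbb{Z}(G/H)$, and
extend it to a $\mathbb{Z}$ linear map of $\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z}G$ into $\mathbb{Z}(G/H)$. The group $H$ acts trivially on
$\mathbb{Z}$ on the right, and it acts on $\mathbb{Z}(G/H)$ by left translation. Let $h$ be in $H$. The passage
$\mathbb{Z} \times \mathbb{Z}G \to \mathbb{Z}(G/H)$ has $(1h, (g)) \mapsto (gH)$ and $(1, h(g)) \mapsto h(gH) = (gH)$;
thus the group homomorphism $\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z}G \to \mathbb{Z}(G/H)$ descends to a homomorphism
of $\mathbb{Z} \otimes_{\mathbb{Z}H} \mathbb{Z}G$ into $\mathbb{Z}(G/H)$. This is certainly onto. To see that it is one-one, let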
5; 211 ® (g;)  O. Then >>; nj(g;H) = 0, and for each coset representative v 
in G, Dpevn Mi(gi) = 0. Sod; ni(h; 'v) = 0, and (37; ni(h;'))(v) = 0. Then 
>; nj(h;') = 0 in ZH because (v) is invertible in ZG, and it follows that the map 
is one-one. 

For (g), (f) gives $B^H = \operatorname{Hom}_{\mathbb{Z}H}(\mathbb{Z}, \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}G, M)) \cong \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z} \otimes_{\mathbb{Z}H} \mathbb{Z}G, M) \cong
\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}(G/H), M)$, and the same argument as in (b) shows that $H^k(G/H, B^H) = 0$
for $k \ge 1$.

Conclusion (h) is immediate because $q \ge 2$ and because all the cohomology
associated with $B$ has been shown to be 0 in degrees $\ge 1$.

The commutativity in conclusion (i) follows because the inflation and restriction
mappings are clearly functorial. The vertical mappings have been shown to be
isomorphisms in (h). To see via induction that the top row is exact, we have to
verify that $H^k(H, N) = 0$ for $k \le q - 2$; but $H^k(H, N) \cong H^{k+1}(H, M)$ for all
$k \ge 1$, and $H^{k+1}(H, M)$ is assumed to be 0 for $k + 1 \le q - 1$. Therefore the bottom
row is exact, and the induction is complete.

14-16. These problems are routine verifications. 

17. Part (a) follows because $R \otimes_R A$ is naturally isomorphic to $A$. For (b), $F \otimes_R A
= \bigoplus_{s \in S}(F_s \otimes_R A)$ and $1_F \otimes f$ corresponds to $\bigoplus_{s \in S}(1_{F_s} \otimes f)$. The values of the various
$R$ homomorphisms are in the various spaces $F_s \otimes_R B$, whose sum is direct, and thus
the kernel of $1_F \otimes f$ is the direct sum of the kernels. Then (b) follows. For (c), we
see from (a) and (b) that free $R$ modules are flat. In $\mathcal{C}_R$, every projective is a direct
summand of a free module, and thus (c) follows by a second application of (b).

18. Consider $1 \otimes f : M \otimes_R A \to M \otimes_R B$. Any element of $\ker(1 \otimes f)$ is a finite
sum $\sum_i m_i \otimes a_i$, and this lies in $\ker((1 \otimes f)|_{M_F \otimes_R A})$, where $F$ is the finite set of indices
in question. Thus $\ker(1 \otimes f) \ne 0$ implies $\ker((1 \otimes f)|_{M_F \otimes_R A}) \ne 0$ for some $F$. The
converse is immediate because $\ker((1 \otimes f)|_{M_F \otimes_R A}) \subseteq \ker(1 \otimes f)$ for all $F$.


19. The long exact sequence for tensor product over $R$ is of the form

$$\cdots \to \operatorname{Tor}_1^R(A, F) \to \operatorname{Tor}_1^R(A, B) \to A \otimes_R K \to A \otimes_R F \to A \otimes_R B \to 0,$$

and $\operatorname{Tor}_1^R(A, F) = 0$ because $F$ is projective for $\mathcal{C}_R$. This establishes the exactness
of the sequence in the problem. If $A$ is flat, then

$$0 \to \operatorname{Tor}_1^R(A, B) \to A \otimes_R K \to A \otimes_R F \to A \otimes_R B \to 0$$

is exact for each $B$, and $\operatorname{Tor}_1^R(A, B)$ must be 0 for each $B$. Conversely if $\operatorname{Tor}_1^R(A, B)$
is 0 for each $B$, then $A \otimes_R (\,\cdot\,)$ is an exact functor by Proposition 4.3. Hence $A$ is flat
by definition.


20. On the one hand, the long exact sequence associated to tensoring the short
exact sequence given in (a) by $B$ is of the form

$$0 \to \operatorname{Tor}_1^R(M, B) \to \operatorname{Tor}_1^R(T(M), B) \to F \otimes_R B \to M \otimes_R B \to T(M) \otimes_R B \to 0,$$

since $F$ free implies $\operatorname{Tor}_1^R(F, B) = 0$. On the other hand, the given short exact
sequence splits, and tensoring it by $B$ must directly produce a short exact sequence

$$0 \to F \otimes_R B \to M \otimes_R B \to T(M) \otimes_R B \to 0.$$

Thus $\ker(F \otimes_R B \to M \otimes_R B) = 0$, and we must therefore have

$$\operatorname{image}(\operatorname{Tor}_1^R(T(M), B) \to F \otimes_R B) = \ker(F \otimes_R B \to M \otimes_R B) = 0.$$

Consequently $0 \to \operatorname{Tor}_1^R(M, B) \to \operatorname{Tor}_1^R(T(M), B) \to 0$ is exact. This proves (a).

For (b), Problem 18 shows that $M$ is flat if and only if each $M_F$ is flat, and
(a) in combination with Problem 19 shows that each $M_F$ is flat if and only if each
$T(M_F)$ is flat. Now suppose that $M$ is flat, so that $T(M_F)$ is flat for each finite
subset $F$ of $M$. This is true in particular for each finite subset $F'$ of $T(M)$, and
$T(M_{F'}) = M_{F'} = (T(M))_{F'}$. Hence Problem 18 shows that $T(M)$ is flat. Conversely
suppose that $T(M)$ is flat. Then $T(M)_{F'}$ is flat for each finite subset $F'$ of $T(M)$.
Let $F$ be a finite subset of $M$. Then $M_F$ is a finitely generated $R$ submodule, and
the structure theorem shows that $T(M_F)$ is finitely generated. Let $F'$ be a set of
generators for it. Then $T(M_F) = M_{F'} = T(M)_{F'}$. This is flat by Problem 18, since
$T(M)$ is flat, and the first sentence of this paragraph allows us to conclude that $M$ is
flat.

For (c), $T(M) \ne 0$ means that $am = 0$ for some nonzero $a \in R$ and nonzero $m \in M$.
Let $i : (a) \to R$ be the inclusion, which is one-one. Then $i \otimes 1 : (a) \otimes_R M \to
R \otimes_R M \cong M$ has $(i \otimes 1)(a \otimes m) = am = 0$. Thus the one-one map $i$ is carried to
the map $i \otimes 1$ that is not one-one, and tensoring with $M$ is not exact. So $M$ is not flat.

For (d), if $M$ is flat, then $T(M) = 0$ by (c). Conversely if $T(M) = 0$, then $T(M)$
is flat, and (b) shows that $M$ is flat.




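21. Since $\partial'_{p,q}$ and $\partial''_{p,q}$ both lower $p + q$ by 1, they both carry $E_{p+q}$ to $E_{p+q-1}$.
Also, the hypotheses give $(\partial' + \partial'')^2 = \partial'_{p-1,q}\partial'_{p,q} + (\partial'_{p,q-1}\partial''_{p,q} + \partial''_{p-1,q}\partial'_{p,q}) +
\partial''_{p,q-1}\partial''_{p,q} = 0$ on $E_{p,q}$, and we have a chain complex.

22. We compute that $\partial'_{p-1,q}\partial'_{p,q} = (\alpha_{p-1} \otimes 1)(\alpha_p \otimes 1) = \alpha_{p-1}\alpha_p \otimes 1 = 0$, that
$\partial'_{p,q-1}\partial''_{p,q} + \partial''_{p-1,q}\partial'_{p,q} = (\alpha_p \otimes 1)((-1)^p 1 \otimes \beta_q) + ((-1)^{p-1} 1 \otimes \beta_q)(\alpha_p \otimes 1)
= (-1)^p(\alpha_p \otimes \beta_q) - (-1)^p(\alpha_p \otimes \beta_q) = 0$, and that $\partial''_{p,q-1}\partial''_{p,q} =
((-1)^p 1 \otimes \beta_{q-1})((-1)^p 1 \otimes \beta_q) = 1 \otimes \beta_{q-1}\beta_q = 0$.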

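23. The formulas for $\partial'_{p,q}$ and $\partial''_{p,q}$ show that $\ker \partial'_{p,q} = \ker \alpha_p \otimes_R D_q$ and that
$\ker \partial''_{p,q} = C_p \otimes_R \ker \beta_q$. Since $\partial'_{p,q}E_{p,q}$ and $\partial''_{p,q}E_{p,q}$ lie in independent spaces,
$\ker(\partial'_{p,q} + \partial''_{p,q}) = \ker \partial'_{p,q} \cap \ker \partial''_{p,q} = \ker \alpha_p \otimes_R \ker \beta_q$. Similarly $\partial'_{p+1,q}(E_{p+1,q}) =
\alpha_{p+1}(C_{p+1}) \otimes_R D_q$ and $\partial''_{p,q+1}(E_{p,q+1}) = C_p \otimes_R \beta_{q+1}(D_{q+1})$, and hence

$$\operatorname{image}(\partial'_{p+1,q} + \partial''_{p,q+1}) = \alpha_{p+1}(C_{p+1}) \otimes_R D_q + C_p \otimes_R \beta_{q+1}(D_{q+1}).$$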


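Thus if $c$ is in $C_p$, $d$ is in $D_q$, $c'$ is in $\alpha_{p+1}(C_{p+1})$, and $d'$ is in $\beta_{q+1}(D_{q+1})$, then
$(\partial'_{p,q} + \partial''_{p,q})((c + c') \otimes (d + d'))$ is the sum of $(\partial'_{p,q} + \partial''_{p,q})(c \otimes d)$ and three
terms that are in $\operatorname{image}(\partial'_{p+1,q} + \partial''_{p,q+1})$. Consequently we obtain a well-defined
homomorphism of $H_p(C) \otimes_R H_q(D)$ into $H_{p+q}(E)$.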

24. Let $\partial'$ and $\partial''$ be the boundary operators; these satisfy $\partial'\partial'' = -\partial''\partial'$. Let $a$
be a cycle in $E_{-1,k}$, i.e., let $\partial''a = 0$. Since $\partial'a = 0$, the exactness for $\partial'$ produces
$c_{0,k} \in E_{0,k}$ with $a = \partial'c_{0,k}$. Since $\partial''a = 0$, this has $\partial'\partial''c_{0,k} = -\partial''\partial'c_{0,k} =
-\partial''a = 0$. Now suppose inductively on $i \ge 0$ that $j \ge 0$ is defined by $i + j = k$ and
that $c_{i,j} \in E_{i,j}$ is given with $\partial'\partial''c_{i,j} = 0$. By the assumed exactness, $\partial'\partial''c_{i,j} = 0$
implies $\partial''c_{i,j} = \partial'c_{i+1,j-1}$ for some $c_{i+1,j-1} \in E_{i+1,j-1}$, and then $\partial'\partial''c_{i+1,j-1} =
-\partial''\partial'c_{i+1,j-1} = -\partial''\partial''c_{i,j} = 0$. The induction leads us nonuniquely to $c_{k,0} \in E_{k,0}$
such that $\partial'\partial''c_{k,0} = 0$. Define $b \in E_{k,-1}$ by $b = \partial''c_{k,0}$, and then $\partial'b = 0$. The result
of the construction is therefore that we pass nonuniquely from the cycle $a \in E_{-1,k}$
for $\partial''$ to a cycle $b \in E_{k,-1}$ for $\partial'$.

Inverting the steps and the choices, we see that we can pass from $b$ back to $a$. Thus if
we can address the nonuniqueness, then the isomorphism in homology will have been
established. We are to show that if $a \in E_{-1,k}$ at the start is a boundary relative to $\partial''$,
then any system of choices leads to a result $b \in E_{k,-1}$ that is a boundary for $\partial'$. Since
$a$ is assumed to be a boundary for $\partial''$, $a = \partial''a'$ with $a' \in E_{-1,k+1}$. The element $a'$ has
$\partial'a' = 0$, and thus $a' = -\partial'a_{0,k+1}$ for some $a_{0,k+1} \in E_{0,k+1}$. Meanwhile, the above
construction makes $a = \partial'c_{0,k}$. So $\partial'\partial''a_{0,k+1} = -\partial''\partial'a_{0,k+1} = \partial''a' = a = \partial'c_{0,k}$.
By exactness, $c_{0,k} - \partial''a_{0,k+1} = \partial'b_{1,k}$ for some $b_{1,k} \in E_{1,k}$. This proves that $c_{0,k}$
is of the form $c_{0,k} = \partial''a_{0,k+1} + \partial'b_{1,k}$ with $a_{0,k+1} \in E_{0,k+1}$ and $b_{1,k} \in E_{1,k}$. (Note
that this form for $c_{0,k}$ already implies that $\partial'\partial''c_{0,k} = 0$.)

Now suppose inductively on $i \ge 0$ that $j \ge 0$ is defined by $i + j = k$ and
that $c_{i,j} \in E_{i,j}$ is given with $c_{i,j} = \partial''a_{i,j+1} + \partial'b_{i+1,j}$. The constructed element
$c_{i+1,j-1} \in E_{i+1,j-1}$ has $\partial''c_{i,j} = \partial'c_{i+1,j-1}$. Thus
$\partial'c_{i+1,j-1} = \partial''\partial'b_{i+1,j} = -\partial'\partial''b_{i+1,j}$, and $c_{i+1,j-1} + \partial''b_{i+1,j} = \partial'b_{i+2,j-1}$
for some $b_{i+2,j-1} \in E_{i+2,j-1}$. If
we put $a_{i+1,j} = -b_{i+1,j}$, then we have $c_{i+1,j-1} = \partial''a_{i+1,j} + \partial'b_{i+2,j-1}$, and the
induction goes through to $i = k$. Consequently any choice of $c_{k,0}$ obtained starting
from the boundary $a$ is of the form $c_{k,0} = \partial''a_{k,1} + \partial'b_{k+1,0}$. The final step is to define
$b = \partial''c_{k,0}$, and then we have $b = \partial''\partial'b_{k+1,0} = -\partial'\partial''b_{k+1,0}$, and $b$ is exhibited as
a boundary relative to $\partial'$.

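25. Since each $C_p$ is projective for $p \ge 0$, $C_p \otimes_R D$ is exact. Similarly $C \otimes_R D_q$ is
exact for $q \ge 0$. The hypotheses of Problem 24 are satisfied, and the two homologies
match.

26. $H_0(C) = H_0(C') = H_0(D) = \mathbb{Z}/2\mathbb{Z}$, and $H_p(C) = H_p(C') = H_p(D) = 0$
for $p \ne 0$. $H_0(C \otimes_{\mathbb{Z}} D) = H_0(C' \otimes_{\mathbb{Z}} D) = \mathbb{Z}/2\mathbb{Z}$, $H_1(C \otimes_{\mathbb{Z}} D) = 0$ and
$H_1(C' \otimes_{\mathbb{Z}} D) = \mathbb{Z}/2\mathbb{Z}$, $H_p(C \otimes_{\mathbb{Z}} D) = H_p(C' \otimes_{\mathbb{Z}} D) = 0$ for $p \notin \{0, 1\}$.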

27. Let $Z_p = \ker \partial_p \subseteq C_p$, $B_p = \operatorname{image} \partial_{p+1} \subseteq C_p$, and $B'_p = B_{p-1}$. Since $R$
is a principal ideal domain, Problem 20 shows that flat is equivalent to torsion free.
Modules of the complex $C$ are flat by assumption, hence torsion free. Modules of $Z$
and $B'$ are $R$ submodules of these, hence are torsion free, hence are flat.


28. The long exact sequence in homology shows that

$$\operatorname{Tor}_1^R(B', D) \to Z \otimes_R D \to C \otimes_R D \to B' \otimes_R D \to 0$$

is exact. Since $B'$ is flat, Problem 19 shows that $\operatorname{Tor}_1^R(B', D) = 0$.

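29. For (a), the boundary map on $B'_p \otimes_R D_q$ in $B' \otimes_R D$ is $\partial' \otimes 1 + (-1)^p(1 \otimes \partial'')$,
and $\partial' = 0$ on boundaries in $B'_p$.

For (b), tensoring with $B'$ is an exact functor, since $B'$ is flat. Therefore the
exactness of $0 \to Z \to D \xrightarrow{\partial''} B' \to 0$ implies the exactness of

$$0 \to (B' \otimes_R Z)_n \to (B' \otimes_R D)_n \xrightarrow{1 \otimes \partial''} (B' \otimes_R B')_n \to 0$$

for each $n$. From the exactness of this sequence, we can read off that $\ker(1 \otimes \partial'')_n$
within $(B' \otimes_R D)_n$ is $(B' \otimes_R Z)_n$ and that $\operatorname{image}(1 \otimes \partial'')_n$ on $(B' \otimes_R D)_n$ is $(B' \otimes_R B')_n$,
which is the same thing as $(B' \otimes_R B)_{n-1}$.

For (c), the results of (b) show that

$$H_n(B' \otimes_R D) = \ker(1 \otimes \partial'')_n/\operatorname{image}(1 \otimes \partial'')_{n+1} = (B' \otimes_R Z)_n/(B' \otimes_R B)_n.$$

Since tensoring with $B'$ is exact, the exactness of $0 \to B \to Z \to H(D) \to 0$
implies the exactness of

$$0 \to B' \otimes_R B \to B' \otimes_R Z \to B' \otimes_R H(D) \to 0$$

in each degree. Thus $B' \otimes_R H(D) \cong (B' \otimes_R Z)/(B' \otimes_R B)$, and $H_n(B' \otimes_R D) \cong
(B' \otimes_R H(D))_n = (B \otimes_R H(D))_{n-1}$.

Part (d) is handled in a fashion similar to (c).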




30. For (a), $\operatorname{Tor}_1^R(Z, H(D)) = 0$ because $Z$ is flat.

In (b), comparison of the exact sequence with ker w,_; with the exact sequence 
displayed before part (a) (but with n replaced by n — 1) shows that ker@,—1 is 
isomorphic to Tort (H(C), H(D))n_1. Substituting for ker w,—; and incorporating 
the isomorphism into the mapping into H,(B’ ®p D) leads to 6) _, as the one-one 
mapping. 

In (c), we have 


coker(t @ 1) = Ay, (C @pr D)/ image(t, ® 1) = Ay(C @pr D)/ker(d,, ® 1) 
= image(d/, @ 1) = kerw,_; = Tor’ (A(C), H(D))n-1. 


The composition of maps leading from H,(C @pr D) to H,(B’ @pr D) has to be 
a) @ 1, and thus 6’, Bn—-1 = 9), ® 1. The map By_1, apart from isomorphisms, is 
onto because q was constructed as onto. 

Part (d) is completely analogous, and the resulting map @, is one-one. 

For (e), we know that a is one-one and that 6 is onto. Also, we have Bas Bn—1n a, 
= (0) ® 1); @ 1) = 0. Since f’_, is one-one and a’, is onto, B,-1a, = 0. 
Finally suppose that x is in ker B,_,. Then x is in ker(6’_, 8,1) = ker(0/, ® 1) = 
image(t, ® 1) = image(a,a/,) = imagea,. This completes the proof of exactness. 

31. This is immediate. 


32. Let $X = \{X_n\}$ and $Y = \{Y_n\}$. Then $\operatorname{Morph}(X, Y)$ is the subgroup of
$\prod_{n=-\infty}^{\infty} \operatorname{Hom}(X_n, Y_n)$ consisting of those elements in the product satisfying the chain
map conditions. A zero object is any tuple of 0's, and certainly product and coproduct
make sense. One readily verifies that the tuple of kernels of a chain map furnishes a
kernel for a chain map and that the tuple of cokernels furnishes a cokernel.


33. The additional objects and morphisms at the top of the extended diagram are 
Co = 2Z/8Z, Bo = Z, k given by 2 mod 8 +> 2 mod 8, k given by x 2, v given by 
1 +> 2 mod 8, and @ given by x 4. Since the composition of k followed by B = x 2 
is not 0, (Bo, k) cannot be the kernel of 8. 

The additional objects and morphisms at the bottom of the extended diagram are 
Ap = Z/4Z, By = Z/16Z, p given by 1 +> 1 mod 4, P given by 1+ 1 mod 16, g’ 
given by | mod 41+> 4 mod 16, and w’ given by | mod 16+ 1| mod 4. 


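34. We give the argument only for $\operatorname{Hom}(M, \,\cdot\,)$. Let $0 \to A \xrightarrow{\varphi} B \xrightarrow{\psi} C \to 0$ be
a given exact sequence, and form the sequence

$$0 \to \operatorname{Hom}(M, A) \xrightarrow{\operatorname{Hom}(1,\varphi)} \operatorname{Hom}(M, B) \xrightarrow{\operatorname{Hom}(1,\psi)} \operatorname{Hom}(M, C).$$

We are to show that $\operatorname{Hom}(1, \varphi)$ is one-one and that exactness holds at $\operatorname{Hom}(M, B)$.
If $\sigma$ is in $\operatorname{Hom}(M, A)$ with $\operatorname{Hom}(1, \varphi)(\sigma) = 0$, then $\varphi\sigma = 0$, and it follows that
$\sigma = 0$ because $\varphi$ is a monomorphism.

For the exactness at $\operatorname{Hom}(M, B)$, we use Theorem 4.42e. We know immediately
that $\operatorname{Hom}(1, \psi)\operatorname{Hom}(1, \varphi) = \operatorname{Hom}(1, \psi\varphi) = \operatorname{Hom}(1, 0) = 0$. Thus suppose that
$\tau \in_m \operatorname{Hom}(M, B)$ has $\operatorname{Hom}(1, \psi)\tau = 0$. This condition means that $\psi\tau = 0$. Since
the given sequence is exact, Theorem 4.42e produces some $\tau' \in_m A$ with $\varphi\tau' = \tau$.
In turn, this says that $\operatorname{Hom}(1, \varphi)\tau' = \tau$. By Theorem 4.42, we have exactness at
$\operatorname{Hom}(M, B)$.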

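35. We give the proof only that the splitting of exact sequences as indicated
implies that $P$ is projective. Thus suppose that a morphism $\tau \in \operatorname{Hom}(P, B)$ and an
epimorphism $\psi \in \operatorname{Hom}(C, B)$ are given. We are to produce $\sigma \in \operatorname{Hom}(P, C)$ with
$\tau = \psi\sigma$. Let $(W, \widetilde\psi, \widetilde\tau)$ be a pullback of $(\psi, \tau)$. Then $\tau\widetilde\psi = \psi\widetilde\tau$, and Proposition
4.40 shows that $\widetilde\psi$ is an epimorphism. Then it follows that

$$0 \to \operatorname{domain}(\ker \widetilde\psi) \xrightarrow{\ker \widetilde\psi} W \xrightarrow{\widetilde\psi} P \to 0$$

is exact, and it must split by assumption. Thus there exists $\rho \in \operatorname{Hom}(P, W)$ with
$\widetilde\psi\rho = 1_P$. Put $\sigma = \widetilde\tau\rho$. Then $\psi\sigma = \psi\widetilde\tau\rho = \tau\widetilde\psi\rho = \tau 1_P = \tau$, as required.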


Chapter V 


1. If $\xi$ is a root of $F(X)$, then the given formula shows that $D(\xi)$ is $-23$ and $-31$
in the two cases. These contain no square factor and therefore equal $D_K$ in the two
cases.

2. For (a), let $G(X) = F(X + \tfrac{2}{3}) = X^3 - \tfrac{4}{3}X + \tfrac{38}{27}$. Then $F(X)$ and $G(X)$
have the same discriminant, and the discriminant for $G(X)$ is given by the formula
of Problem 1. It is $-44$.
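
The value $-44$ can be checked with exact rational arithmetic. The sketch below assumes $F(X) = X^3 - 2X^2 + 2$, which is the cubic consistent with the multiplication matrix in part (b) (it encodes $\xi^3 = 2\xi^2 - 2$). The function names are illustrative, and the general discriminant formula for a monic cubic is the standard one rather than a formula quoted from the text.

```python
from fractions import Fraction


def cubic_discriminant(b, c, d):
    """Discriminant of the monic cubic X^3 + b X^2 + c X + d."""
    return b*b*c*c - 4*c**3 - 4*b**3*d - 27*d*d + 18*b*c*d


def depressed(b, c, d):
    """Coefficients (p, q) with F(X - b/3) = X^3 + p X + q."""
    b, c, d = Fraction(b), Fraction(c), Fraction(d)
    p = c - b*b/3
    q = 2*b**3/27 - b*c/3 + d
    return p, q


def depressed_discriminant(p, q):
    """Discriminant -4p^3 - 27q^2 of X^3 + pX + q."""
    return -4*p**3 - 27*q*q
```

With $(b, c, d) = (-2, 0, 2)$ the shift gives $p = -4/3$ and $q = 38/27$, and both routes return $-44$.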

For (b), let $x = a + b\xi + c\xi^2$ be given with $a, b, c$ all in $\{0, 1\}$. The matrix of
left-by-$x$ in the ordered basis $(1, \xi, \xi^2)$ works out to be

$$\begin{pmatrix} a & -2c & -2b - 4c \\ b & a & -2c \\ c & b + 2c & a + 2b + 4c \end{pmatrix},$$

and the determinant of it is

$$a^3 + 2a^2(b + 2c) + 4c^3 - 2b(b + 2c)^2 + 4ac(b + 2c) + 2bc(a + 2b + 4c).$$

For $x$ to be twice an algebraic integer, this determinant, which is the norm of $x$, has
to be $\equiv 0 \bmod 8$. All the terms are even except possibly the first, and thus $a$ has to be
even. That is, $a = 0$. The determinant then reduces to $4c^3 - 2b(b + 2c)^2 + 4bc(b + 2c)$.
All terms here are divisible by 4 except possibly $-2b^3$. Thus $b$ must be even. That
is, $b = 0$. The determinant reduces in this case to $4c^3$. For this to be divisible by 8, $c$
must be even. That is, $c = 0$. Proposition 5.2 consequently says that a further factor
of $2^2$ cannot be eliminated from the discriminant.
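
The argument in (b) can be confirmed by direct enumeration. The following Python sketch (helper names are illustrative) builds the matrix of left multiplication by $x = a + b\xi + c\xi^2$, under the assumption $\xi^3 = 2\xi^2 - 2$ read off from that matrix, compares its determinant with the expanded expression, and lists the nonzero triples in $\{0,1\}^3$ whose norm is divisible by 8.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)


def norm_matrix(a, b, c):
    """Matrix of left-by-x in the basis (1, xi, xi^2), x = a + b xi + c xi^2."""
    return [[a, -2*c, -2*b - 4*c],
            [b, a, -2*c],
            [c, b + 2*c, a + 2*b + 4*c]]


def norm_formula(a, b, c):
    """Expanded determinant of norm_matrix(a, b, c)."""
    return (a**3 + 2*a*a*(b + 2*c) + 4*c**3 - 2*b*(b + 2*c)**2
            + 4*a*c*(b + 2*c) + 2*b*c*(a + 2*b + 4*c))


def residues_with_norm_divisible_by_8():
    """Nonzero (a, b, c) in {0,1}^3 whose norm is 0 mod 8."""
    return [(a, b, c)
            for a in (0, 1) for b in (0, 1) for c in (0, 1)
            if (a, b, c) != (0, 0, 0)
            and det3(norm_matrix(a, b, c)) % 8 == 0]
```

The list comes back empty, which matches the conclusion that no nonzero choice of $a, b, c$ in $\{0, 1\}$ makes $x$ twice an algebraic integer.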




3. For (a), Theorem 5.21 and the remarks after it show that every equivalence
class contains an ideal whose norm is $\le (0.283)|D_K|^{1/2}$. Proposition 5.8 shows that
$|D_K| = 3^5 = 243$. Thus every equivalence class contains an ideal with norm $\le 4$.
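
The numerical bound can be reproduced in a few lines. This sketch assumes that the constant $0.283$ is the Minkowski constant $(4/\pi)^{r_2}\, n!/n^n$ for a cubic field with one pair of complex embeddings ($n = 3$, $r_2 = 1$), which is the case for $\mathbb{Q}(\sqrt[3]{3})$. The function names are illustrative.

```python
import math


def minkowski_constant(n, r2):
    """(4/pi)^r2 * n!/n^n for a number field of degree n with r2 complex places."""
    return (4 / math.pi) ** r2 * math.factorial(n) / n ** n


def minkowski_bound(n, r2, abs_disc):
    """Upper bound for the norm of some ideal in each ideal class."""
    return minkowski_constant(n, r2) * math.sqrt(abs_disc)
```

Here `minkowski_bound(3, 1, 243)` is a little above 4.4, so only ideals of norm at most 4, and hence only the primes 2 and 3, need to be examined.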

Conclusion (b) is immediate from Theorem 5.6 with $F(X) = X^3 - 3$. Conclusion
(c) follows because $(\sqrt[3]{3} - 1)(\sqrt[3]{9} + \sqrt[3]{3} + 1) = (\sqrt[3]{3})^3 - 1 = 3 - 1 = 2$. Conclusion
(d) is immediate from Proposition 5.10d.

For (e), any nonzero ideal is the product of powers of prime ideals associated with
the various prime numbers. The ones corresponding to the prime numbers 2 and 3 are
principal ideals by (b), (c), and (d). These are the only ones that need to be checked,
according to (a). Thus every nonzero ideal is principal.

4. Conclusion (a) is immediate from Theorem 5.6, since $X^3 - 7$ factors modulo 2
as $(X + 1)(X^2 + X + 1)$. For (b), we show that no element $x = a + b\sqrt[3]{7} + c\sqrt[3]{49}$
has norm $\pm 2$. Left multiplication by $x$ carries $1$ to $a + b\sqrt[3]{7} + c\sqrt[3]{49}$, carries $\sqrt[3]{7}$ to
$7c + a\sqrt[3]{7} + b\sqrt[3]{49}$, and carries $\sqrt[3]{49}$ to $7b + 7c\sqrt[3]{7} + a\sqrt[3]{49}$. Thus its matrix is

\[
\begin{pmatrix} a & 7c & 7b \\ b & a & 7c \\ c & b & a \end{pmatrix}.
\]

The determinant is $a^3 + 49c^3 + 7b^3 - 21abc$, which is congruent modulo 7 to $a^3$.
Modulo 7, the cubes are $0$ and $\pm 1$, and thus the congruence $a^3 \equiv \pm 2 \bmod 7$ has no
solution.
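
As a quick sanity check of this argument, here is a small sketch (the function names are illustrative, not from the text). It recomputes the determinant of the multiplication matrix, compares it with the closed formula, and lists the cubes modulo 7.

```python
from itertools import product

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def norm(a, b, c):
    """Closed formula for the norm of a + b*7^(1/3) + c*49^(1/3)."""
    return a**3 + 49 * c**3 + 7 * b**3 - 21 * a * b * c

grid = list(product(range(-6, 7), repeat=3))

# The closed formula matches the determinant of the multiplication matrix.
formula_ok = all(
    det3([[a, 7 * c, 7 * b], [b, a, 7 * c], [c, b, a]]) == norm(a, b, c)
    for a, b, c in grid
)

# Residues of cubes modulo 7; here 6 plays the role of -1.
cubes_mod_7 = sorted({(a**3) % 7 for a in range(7)})

# Norm values attained on the sample grid.
small_norms = {norm(a, b, c) for a, b, c in grid}
```

Since the norm is congruent to $a^3$ modulo 7, the values $\pm 2$ (residues 2 and 5) cannot occur; the grid search merely illustrates this on a finite sample.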

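5. Since the element $\sqrt{-1} + \sqrt{-5}$ has degree 4 over $\mathbb{Q}$, the minimal polynomial
has degree 4. The product of $(X - (\sqrt{-1} + \sqrt{-5}))$ and the Galois transforms
$(X - (\sqrt{-1} - \sqrt{-5}))$, $(X - (-\sqrt{-1} + \sqrt{-5}))$, and $(X - (-\sqrt{-1} - \sqrt{-5}))$ is
$X^4 + 12X^2 + 16$, which is in $\mathbb{Z}[X]$.

6. The minimal polynomial of $\xi = \tfrac12(\sqrt{-1} + \sqrt{-5})$ is $H(X) = X^4 + 2^{-2}\cdot 12X^2 +
2^{-4}\cdot 16 = X^4 + 3X^2 + 1$ with $|D(\xi)| = |N_{L/\mathbb{Q}}(H'(\xi))|$. Here $H'(X) = 4X^3 +
6X = 2X(2X^2 + 3)$. Since $\xi^4 + 3\xi^2 + 1 = 0$, we have $\xi^2 = \tfrac12(-3 \pm \sqrt5)$; thus
$2\xi^2 + 3 = \pm\sqrt5$. So $|D(\xi)| = |N_{L/\mathbb{Q}}(\pm 2\xi\sqrt5)|$. The four conjugates of $\sqrt5$ are
$+\sqrt5$ twice and $-\sqrt5$ twice, and the norm is the product of the four conjugates; moreover
$N_{L/\mathbb{Q}}(\xi) = H(0) = 1$. Thus $|D(\xi)| = |N_{L/\mathbb{Q}}(\pm 2\xi\sqrt5)| = 2^4 5^2$.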

7. These follow immediately by applying Theorem 5.6 to the indicated prime, 2
or 5, and the respective polynomials: $X^2 + 5$, $X^2 + X - 1$, and $X^2 + 1$.

8. With $\mathbb{Q} \subseteq K' \subseteq L$, the $(e, f, g)$ for $L/\mathbb{Q}$ has to be entry by entry $\ge$ the triple for
$K'/\mathbb{Q}$. The triple for $K'/\mathbb{Q}$ is given in Problem 7b as $(1, 2, 1)$ for $p = 2$. Similarly
from $\mathbb{Q} \subseteq K'' \subseteq L$, the $(e, f, g)$ for $L/\mathbb{Q}$ has to be $\ge (2, 1, 1)$. Thus $e \ge 2$, $f \ge 2$,
and $g \ge 1$. Since $efg = 4$, equality must hold throughout: $(e, f, g) = (2, 2, 1)$.

This proves (a). Similarly for (b), we must have $(e, f, g) \ge (2, 1, 1)$ and $(e, f, g)
\ge (1, 1, 2)$. Thus $(e, f, g) \ge (2, 1, 2)$. Since $efg = 4$, $(e, f, g) = (2, 1, 2)$.

9. In (a), Problem 8a shows that $(2)T = P^2$, and we know that $(2)R = \wp_2^2$. Then
$P^2 = (2)T = (2)RT = \wp_2^2T = (\wp_2T)(\wp_2T)$. Since $P$ is prime, $P$ divides $\wp_2T$.
For the equality $P^2 = (\wp_2T)^2$ to hold, we must have $P = \wp_2T$.




Similarly $(5)T = P_1^2P_2^2$ and $(5)R = \wp_5^2$. Then $P_1^2P_2^2 = (5)T = (5)RT =
\wp_5^2T = (\wp_5T)(\wp_5T)$. Since $P_1$ and $P_2$ are prime, $P_1$ and $P_2$ must divide $\wp_5T$. Therefore
$P_1P_2 = \wp_5T$.

In (b), conclusion (a) shows that no prime ideal of R that divides (2)R or (5)R 
ramifies in T. Since D(&) is divisible by no prime numbers other than 2 and 5, 
Theorem 5.6 shows that no prime ideal (p) of Z ramifies in 7. Hence no prime ideal 
of R containing such a prime (p) of Z ramifies in T. 


10. Roots of unity must map to roots of unity under the embedding, and there 
are only two roots of unity within R. Hence there are no real-valued embeddings 
when $p > 2$. Thus the embeddings come in complex-conjugate pairs. The product
$\sigma(x)\overline{\sigma(x)}$ is positive for $x \ne 0$, and $N_{K/\mathbb{Q}}(x)$ is the product of these expressions over
all such pairs.


11. For (a), $F(X)$ is the minimal polynomial of $\zeta^k$ when $\mathrm{GCD}(k, p) = 1$. Then
$\zeta^k - 1$ is a root of $G(X) = F(X + 1)$ of the correct degree, and therefore $G(X)$ is
the minimal polynomial of $\zeta^k - 1$. If $H(X)$ is the field polynomial of an element $\eta$,
then $N_{K/\mathbb{Q}}(\eta) = (-1)^{[K:\mathbb{Q}]}H(0)$. In this instance $[K : \mathbb{Q}] = p - 1$ is even. Taking
$\eta = \zeta^k - 1$, we obtain $N_{K/\mathbb{Q}}(\zeta^k - 1) = G(0) = F(1) = p$.

For (b), $\zeta - 1$ divides $\zeta^k - 1$, and hence the quotient is in $R$. If $l$ is chosen with
$lk \equiv 1 \bmod p$, then $\zeta - 1 = \zeta^{lk} - 1$, and $\zeta^k - 1$ divides $\zeta^{lk} - 1$. Therefore the
reciprocal of $(\zeta^k - 1)/(\zeta - 1)$ is in $R$.

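12. With $F(X)$ and $G(X)$ as in the previous problem, $F'(\zeta^k) = G'(\zeta^k - 1)$.
Here $F(X) = (X^p - 1)/(X - 1)$ makes $G(X) = X^{-1}[(X + 1)^p - 1]$ and $G'(X) =
X^{-2}[pX(X + 1)^{p-1} - (X + 1)^p + 1]$. Since $\zeta^{kp} = 1$,

\[
F'(\zeta^k) = G'(\zeta^k - 1) = (\zeta^k - 1)^{-2}\bigl[p(\zeta^k - 1)\zeta^{k(p-1)} - 1 + 1\bigr] = (\zeta^k - 1)^{-1}p\,\zeta^{k(p-1)}.
\]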


The result now follows from the formula D(¢ k) = F’ ( ae 


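13. Continuing from the previous problem gives
\[
N_{K/\mathbb{Q}}(F'(\zeta^k)) = N_{K/\mathbb{Q}}(\zeta^k - 1)^{-1}\,p^{p-1}\,N_{K/\mathbb{Q}}(\zeta^{k(p-1)}) = p^{p-2}.
\]

The result follows from the computation $(-1)^{(p-1)(p-2)/2}D(\zeta^k) =
N_{K/\mathbb{Q}}(F'(\zeta^k)) = p^{p-2}$.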
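
The sign and size of this discriminant can be checked numerically for small primes. The sketch below is an illustration only (the function names are not from the text). It multiplies the squared differences of the primitive $p$-th roots of unity in floating point and rounds the result, so it is a plausibility check rather than a proof.

```python
import cmath

def cyclotomic_discriminant(p):
    """Numerically evaluate prod_{i<j} (r_i - r_j)^2 over the primitive p-th roots of unity."""
    roots = [cmath.exp(2j * cmath.pi * k / p) for k in range(1, p)]
    disc = 1
    for i in range(len(roots)):
        for j in range(i + 1, len(roots)):
            disc *= (roots[i] - roots[j]) ** 2
    # The exact value is a rational integer; round away floating-point noise.
    return round(disc.real)

def predicted(p):
    """The value (-1)^((p-1)(p-2)/2) * p^(p-2) obtained in the hint."""
    sign = (-1) ** ((p - 1) * (p - 2) // 2)
    return sign * p ** (p - 2)

checks = {p: (cyclotomic_discriminant(p), predicted(p)) for p in (3, 5, 7)}
```

For $p = 3, 5, 7$ the two columns agree, giving $-3$, $125$, and $-16807$.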

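14. For (a), we have $\lambda^k = (1 - \zeta)^k = \sum_{j=0}^k (-1)^j\binom{k}{j}\zeta^j$ and $\zeta^k = (1 - \lambda)^k =
\sum_{j=0}^k (-1)^j\binom{k}{j}\lambda^j$. Conclusion (b) is a version of Problem 11b because the conjugates
of $\zeta$ are the powers $\zeta^j$ for $1 \le j \le p - 1$. For (c), we have
\[
p = \prod_{k=1}^{p-1}(1 - \zeta^k) = \prod_{k=1}^{p-1}(1 - \zeta)u_k = (1 - \zeta)^{p-1}\prod_{k=1}^{p-1}u_k,
\]
where $u_k = (1 - \zeta^k)/(1 - \zeta)$. Each element
$u_k$ is a unit by Problem 11c, and (c) follows.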


15. The identity $(p)R = (1 - \zeta)^{p-1}$ is immediate from Problem 14c. The
extension $K/\mathbb{Q}$ being Galois, we know that the prime decomposition of the ideal




$(p)R$ is of the form $(p)R = P_1^e \cdots P_g^e$, where $p - 1 = efg$ and $f$ is the common
value of all $\dim_{\mathbb{F}_p}(R/P_j)$. This latter fact says that no factorization of $(p)R$ into
proper ideals can have more than $p - 1$ factors, and $p - 1$ factors occur only if all
factors are prime. In this case, $(1 - \zeta)$ is a proper ideal because $N_{K/\mathbb{Q}}(1 - \zeta) = p$.
Thus each factor $(1 - \zeta)$ is prime.


16. Following Proposition 5.2, suppose that $a_j$ is an integer for each $j$ with
$s \le j \le k$ such that $0 \le a_j \le p - 1$, $a_s \ne 0$, $a_k = 1$, and

\[
a_s\lambda^s + a_{s+1}\lambda^{s+1} + \cdots + a_k\lambda^k = pr
\]

with $r$ in $R$. Subtracting all terms from the left side but the first and applying
Problem 15 shows that $a_s\lambda^s$ lies in $(\lambda)^{s+1}$. Thus $(a_s)(\lambda)^s \subseteq (\lambda)^{s+1}$. Canceling gives
$(a_s) \subseteq (\lambda)$, and this inclusion is a contradiction because $\mathrm{GCD}(N((a_s)), N((\lambda))) = 1$.


17. Each step toward a $\mathbb{Z}$ basis multiplies a discriminant by a square, and it is
enough to prove that a primitive element $\xi$ for $K/\mathbb{Q}$ lying in $R$ has $\operatorname{sgn} D(\xi) = (-1)^{r_2}$.
We are thus to compute the sign of $\prod_{i<j}(\sigma_i(\xi) - \sigma_j(\xi))^2$. For a given pair $(i, j)$, the
factor $(\sigma_i(\xi) - \sigma_j(\xi))^2$ is matched by its complex conjugate elsewhere in the product
unless $\sigma_i$ and $\sigma_j$ are both real or are complex conjugates of one another. The factor
and its mate have a positive product, and a pair with $\sigma_i$ and $\sigma_j$ both real contributes a
positive square. If $\sigma_j = \bar\sigma_i$, then $\sigma_i(\xi) - \sigma_j(\xi)$ is purely imaginary, and its square is
negative. Hence the sign is $(-1)^{r_2}$.

18. Let $g$ be in $\mathrm{Gal}(K/\mathbb{Q}) = \{\sigma_1, \dots, \sigma_n\}$. Replacing each $\sigma_i$ by $g\sigma_i$ has the
effect of permuting the columns of $[\sigma_i(\alpha_j)]$. If the permutation is even, then the
terms contributing to $P$ are the same before and after the permutation; otherwise they
are interchanged. In either case, $P + N$ and $PN$ are fixed. Since $P + N$ and $PN$
are fixed by the Galois group, they are in $\mathbb{Q}$. The entries $\sigma_i(\alpha_j)$ of the matrix are in
$R$, and thus $P$ and $N$ are in $R$. Consequently $P + N$ and $PN$ are in $\mathbb{Z}$. The formula
$D(\Gamma) = (P + N)^2 - 4PN$ shows that $D(\Gamma) \equiv (P + N)^2 \bmod 4$. Any square of a
member of $\mathbb{Z}$ is congruent to 0 or 1 modulo 4, and the result follows.

19. Let $J$ be an ideal of $S^{-1}R$. Proposition 8.47 of Basic Algebra shows that
$I = R \cap J$ is an ideal in $R$ and that $J = S^{-1}I$. Since $I_1, \dots, I_h$ is a complete set
of representatives for the equivalence classes, $aI = bI_j$ for some $j$ with $1 \le j \le h$.
Let $(a)_S$ and $(b)_S$ be the principal ideals of $S^{-1}R$ generated by $a$ and $b$. The fact that
$u$ is in $I_j \cap S$ means that $S^{-1}I_j = S^{-1}R$, and thus

\[
(a)_SJ = S^{-1}(a)S^{-1}I = S^{-1}(a)I = S^{-1}(b)I_j
= S^{-1}(b)S^{-1}I_j = S^{-1}(b)S^{-1}R = (b)_S. \tag{$*$}
\]

Hence $J$ is principal. (In fact, the equality shows that $aj = b$ for some $j \in J$.
Hence $ba^{-1} = j$ is an element of $J \subseteq S^{-1}R$, the principal ideal $(ba^{-1})_S$ of $S^{-1}R$ is
meaningful, and $(ba^{-1})_S \subseteq J$. For the reverse inclusion let $j \in J$ be given, and use
$(*)$ to write $aj = bx$ with $x \in S^{-1}R$. Then $j = (ba^{-1})x$ shows that $j$ is in $(ba^{-1})_S$,
and $J \subseteq (ba^{-1})_S$.)




20. For (a), write $ab = u^k$. Then $a^{-1} = u^{-k}b$ exhibits $a^{-1}$ as in $S^{-1}R$. For (b),
if $u^ma$ is a unit in $S^{-1}R$, then $u^{-m}a^{-1} = u^{-l}c$ for some $c \in R$. Hence $ac = u^{l-m}$.
Since $ac$ is in $R$ and $u^{-1}$ is not, $l - m = k$ with $k \ge 0$. Then $a$ divides $u^k$.

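21. For (a), write $(u) = P_1^{m_1} \cdots P_l^{m_l}$. Then $(u^h) = (P_1^h)^{m_1} \cdots (P_l^h)^{m_l} = (b_1^{m_1} \cdots b_l^{m_l})$.
Thus $u^h = b_1^{m_1} \cdots b_l^{m_l}\varepsilon$ for some unit $\varepsilon$ in $R$, each $b_j$ divides $u^h$, and the conclusion
follows from Problem 20a.

For (b), we have $(a)(b) = (u)^k = P_1^{km_1} \cdots P_l^{km_l}$. Since $a$ and $b$ are in $R$, this
equality implies that $(a) = P_1^{r_1} \cdots P_l^{r_l}$. For each $j$, use the division algorithm to
write $r_j = n_jh + t_j$ with $0 \le t_j < h$. Then $P_j^{r_j} = (P_j^h)^{n_j}P_j^{t_j} = (b_j)^{n_j}P_j^{t_j}$, and
consequently $(a) = (d)P_1^{t_1} \cdots P_l^{t_l}$ as required, where $d = b_1^{n_1} \cdots b_l^{n_l}$.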

The argument for (c) was given in parentheses at the end of the solution of 
Problem 19. 


22. Because of Problem 21d, we now have $(a) = (d)(c_i)$. Thus $a = dc_i\varepsilon$ for
some unit $\varepsilon$ in $R$. Since $u^k = ab = c_idb\varepsilon$, $c_i$ divides $u^k$ and is a unit in $S^{-1}R$ by
Problem 20a.

23. Problem 22 shows that any unit of $S^{-1}R$ is a product of a power of $u$ by a
product $\prod_j b_j^{n_j}$, an element $c_i$, and a unit $\varepsilon$ of $R$. Problem 21a shows that each $b_j$ is
a unit in $S^{-1}R$, and Problem 22 shows that each $c_i$ is a unit in $S^{-1}R$. Thus $(S^{-1}R)^\times$
is generated by $u$, the finitely many elements $b_j$ and $c_i$, and a finite set of generators
of $R^\times$. (The group $R^\times$ is finitely generated by the Dirichlet Unit Theorem.)

24. $G(4/\xi) = 64\xi^{-3} - 16\xi^{-2} + 8\xi^{-1} + 8 = 8\xi^{-3}(\xi^3 + \xi^2 - 2\xi + 8) =
8\xi^{-3}F(\xi) = 0$. The element $\eta$ is in $K$, and it is exhibited as the root of a monic
polynomial in $\mathbb{Z}[X]$; therefore it is in $R$.

25. For (a), $0 = F(\xi)/\xi = \xi^2 + \xi - 2 + 8\xi^{-1} = \xi^2 + \xi - 2 + 2\eta$. For (b),
$0 = G(\eta)/\eta = \eta^2 - \eta + 2 + 8/\eta = \eta^2 - \eta + 2 + 2\xi$. Solving the first equation for $\xi^2$
gives the first formula in the table, and solving the second equation for $\eta^2$ gives the
second formula in the table. The formula $\xi\eta = 4$ is immediate from the definition
$\eta = 4/\xi$. The formulas in the table together show that any integer polynomial in $\xi$
and $\eta$ reduces to a $\mathbb{Z}$ combination of $1$, $\xi$, and $\eta$.

Conclusion (c) is clear. For (d), we have $\eta = 1 - \tfrac12(\xi^2 + \xi)$, and this is not in
$\mathbb{Z}(\{1, \xi, \xi^2\})$. For (e), we have $D((1, \xi, \xi^2)) = -2^2\cdot 503$. Since the only square factor
is $2^2$, it follows that $\mathbb{Z}(\{1, \xi, \xi^2\})$ has index 2 in $\mathbb{Z}(\{1, \xi, \eta\})$ and that $D((1, \xi, \eta)) =
-503$. This latter discriminant is square free and thus cannot be reduced further.
Therefore $D_K = -503$, and $\{1, \xi, \eta\}$ is a $\mathbb{Z}$ basis of $R$. Finally the formula $\eta =
1 - \tfrac12(\xi^2 + \xi)$ shows that $\mathbb{Z}(\{1, \xi, \eta\}) = \mathbb{Z}(\{1, \xi, \tfrac12(\xi^2 + \xi)\})$.
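
The value of the discriminant in (e) can be confirmed with the standard closed formula for the discriminant of a cubic. This is a minimal sketch; the function names are illustrative, and the coefficients are those of $F(X) = X^3 + X^2 - 2X + 8$ and $G(X) = X^3 - X^2 + 2X + 8$ as they appear in these hints.

```python
def cubic_discriminant(a, b, c, d):
    """Discriminant of a*X^3 + b*X^2 + c*X + d (standard closed formula)."""
    return (18 * a * b * c * d - 4 * b**3 * d + b**2 * c**2
            - 4 * a * c**3 - 27 * a**2 * d**2)

def is_prime(n):
    """Trial division; adequate for small n."""
    return n > 1 and all(n % k for k in range(2, int(n**0.5) + 1))

disc_F = cubic_discriminant(1, 1, -2, 8)    # F(X) = X^3 + X^2 - 2X + 8
disc_G = cubic_discriminant(1, -1, 2, 8)    # G(X) = X^3 - X^2 + 2X + 8
```

Both polynomials have discriminant $-2012 = -2^2 \cdot 503$ with 503 prime, so dividing out the square factor $2^2$ leaves the square-free value $-503$.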


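26. Application of $\varphi$ to $\xi^2 = -\xi + 2 - 2\eta$ gives $\bar\xi^2 = \bar\xi$, where $\bar\xi = \varphi(\xi)$ and
$\bar\eta = \varphi(\eta)$. Similarly $\bar\eta^2 = \bar\eta$. The
elements of a finite field of characteristic 2 fixed by the squaring map are 0 and 1.
Hence $\bar\xi$ and $\bar\eta$ are in $\{0, 1\}$. Since $F = \varphi(R)$ is generated by the values of $\varphi$ on $1$,
$\xi$, and $\eta$, $F$ has two elements. From $\xi\eta = 4$, it follows that $\bar\xi\bar\eta = 0$. Thus $\bar\xi$ and $\bar\eta$
cannot both be 1, and the only possibilities are the ones in the table.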




27. Define $\varphi : R \to \mathbb{F}_2$ on $\xi$ and $\eta$ by one of the lines of the table of Problem 26,
and set $\varphi(1) = 1$. Then $\varphi$ extends to a well-defined additive homomorphism on
$\mathbb{Z}(\{1, \xi, \eta\})$. We have to check that $\varphi$ respects multiplication. It is enough to do so
on additive generators. Thus we have to check that $\varphi(\xi^2) = (\varphi(\xi))^2$, that $\varphi(\eta^2) =
(\varphi(\eta))^2$, and that $\varphi(\xi\eta) = (\varphi(\xi))(\varphi(\eta))$. Thus, for example, in the first one we
want $-\varphi(\xi) + 2\varphi(1) - 2\varphi(\eta) = (\varphi(\xi))^2$. If we write the values of $\varphi$ as triples
corresponding to the three possible $\varphi$'s, the left side is $-(0, 1, 0) + 2(1, 1, 1) -
2(0, 0, 1) \equiv (0, 1, 0) \bmod 2$, while the right side is $(0, 1, 0)^2 \equiv (0, 1, 0) \bmod 2$.
These match, and this relation is verified. The other two relations are verified in
similar fashion.
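
All three relations can be checked at once for every candidate pair of values. The sketch below assumes the multiplication table $\xi^2 = -\xi + 2 - 2\eta$, $\eta^2 = \eta - 2 - 2\xi$, $\xi\eta = 4$ from Problem 25; the function name `respects_relations` is illustrative.

```python
# Candidate values (phi(xi), phi(eta)) in F_2, with phi(1) = 1 in every case.
candidates = [(0, 0), (1, 0), (0, 1), (1, 1)]

def respects_relations(x, e):
    """Check the three defining relations modulo 2 for phi(xi) = x, phi(eta) = e."""
    rel_xi2 = (x * x - (-x + 2 - 2 * e)) % 2 == 0    # xi^2 = -xi + 2 - 2*eta
    rel_eta2 = (e * e - (e - 2 - 2 * x)) % 2 == 0    # eta^2 = eta - 2 - 2*xi
    rel_prod = (x * e - 4) % 2 == 0                  # xi * eta = 4
    return rel_xi2 and rel_eta2 and rel_prod

valid = [pair for pair in candidates if respects_relations(*pair)]
```

Exactly three pairs pass, and the pair $(1, 1)$ fails only the relation $\xi\eta = 4$, as in Problem 26.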


28. The norm of a kernel equals the number of elements in the image of the 
homomorphism, which is 2 in each case. Since each ideal has prime norm, the 
ideal is prime. Moreover, these ideals contain (2)R and hence all figure into the 
prime factorization of $(2)R$. On the other hand, we must have $\sum_j e_jf_j = 3$ for
the decomposition, and we have seen that there are at least three terms. So there
are exactly three terms, and we must have $e_j = f_j = 1$ in each case. Therefore
$(2)R = P_{0,0}P_{1,0}P_{0,1}$.

29. For (a), the elements listed are additive generators of the ideal in each case,
and hence they are also ideal generators. For (b), $\eta = \eta(\xi + 1) - 2\cdot 2$ shows that
$\eta$ is in the ideal $(2, \xi + 1)$. Thus $(2, \xi + 1, \eta) \subseteq (2, \xi + 1)$. The reverse inclusion
is clear. In (c), the argument for $(2, \eta + 1)$ is completely symmetric. Let us see that
$(2, \xi, \eta) = (2, \xi - \eta)$. The inclusion $\supseteq$ is clear. For the inclusion $\subseteq$, we use the two
formulas

\[
\begin{aligned}
(-1 - \eta)2 + (-\xi)(\xi - \eta) &= -2 - 2\eta + (\xi + 2\eta - 2) + 4 = \xi, \\
(3 + \xi)2 + (-\eta)(\xi - \eta) &= 6 + 2\xi - 4 + (-2\xi - 2 + \eta) = \eta.
\end{aligned}
\]
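
Identities of this kind are easy to get wrong by hand, so here is a small sketch that performs the arithmetic in $\mathbb{Z}(\{1, \xi, \eta\})$ directly. It assumes the multiplication table $\xi^2 = 2 - \xi - 2\eta$, $\eta^2 = -2 - 2\xi + \eta$, $\xi\eta = 4$ from Problem 25 and represents $u + v\xi + w\eta$ as the triple `(u, v, w)`; all names are illustrative.

```python
def mul(p, q):
    """Multiply u1 + v1*xi + w1*eta by u2 + v2*xi + w2*eta using the table of Problem 25."""
    u1, v1, w1 = p
    u2, v2, w2 = q
    xx = v1 * v2                # coefficient of xi^2  = 2 - xi - 2*eta
    ee = w1 * w2                # coefficient of eta^2 = -2 - 2*xi + eta
    xe = v1 * w2 + w1 * v2      # coefficient of xi*eta = 4
    return (u1 * u2 + 2 * xx - 2 * ee + 4 * xe,
            u1 * v2 + v1 * u2 - xx - 2 * ee,
            u1 * w2 + w1 * u2 - 2 * xx + ee)

def add(p, q):
    """Add two elements coordinate by coordinate."""
    return tuple(s + t for s, t in zip(p, q))

XI, ETA, TWO = (0, 1, 0), (0, 0, 1), (2, 0, 0)
XI_MINUS_ETA = (0, 1, -1)

# (-1 - eta)*2 + (-xi)*(xi - eta), expected to equal xi
lhs_xi = add(mul((-1, 0, -1), TWO), mul((0, -1, 0), XI_MINUS_ETA))
# (3 + xi)*2 + (-eta)*(xi - eta), expected to equal eta
lhs_eta = add(mul((3, 1, 0), TWO), mul((0, 0, -1), XI_MINUS_ETA))
# eta*(xi + 1) - 2*2, expected to equal eta (used in part (b))
eta_check = add(mul(ETA, (1, 1, 0)), (-4, 0, 0))
```

The three results come out as $\xi$, $\eta$, and $\eta$, which is what the inclusions in (b) and (c) require.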


30. For (a), the field polynomial of $\theta - q$ is $H(X + q)$, and so the norm of $\theta - q$ is
$-H(0 + q)$, as required. In (b), the first two formulas come from the field polynomials
$F(X)$ and $G(X)$ of $\xi$ and $\eta$, and the other formulas follow from (a).

In (c), the fact that $N((\xi)) = |N_{K/\mathbb{Q}}(\xi)| = 8$ shows that the prime factorization
of $(\xi)$ is into prime ideals whose norms are powers of two. Problem 28 shows that
all such ideals have been identified, and thus $(\xi) = P_{0,0}^aP_{1,0}^bP_{0,1}^c$ for some exponents
$\ge 0$. Comparing norms shows that $a + b + c = 3$. Similar remarks apply to $(\eta)$.

In (d), use of Problem 28 shows that $P_{0,0}^2P_{1,0}^2P_{0,1}^2 = ((2)R)^2 = (4)R = (\xi)(\eta) =
P_{0,0}^{a+\alpha}P_{1,0}^{b+\beta}P_{0,1}^{c+\gamma}$. Then $a + \alpha = 2$, $b + \beta = 2$, and $c + \gamma = 2$ by unique factorization.

For (e), we observe from the kernels, or else we see from Problem 29a, that $\xi$ is not
in $P_{1,0}$ and that $\eta$ is not in $P_{0,1}$. Hence $P_{1,0}$ does not appear in the prime factorization of
$(\xi)$, and $P_{0,1}$ does not appear in the prime factorization of $(\eta)$. Therefore $b = \gamma = 0$.

For (f), the results of (e) and (d) combine to show that $a + \alpha = 2$, $\beta = 2$, and
$c = 2$. Since $a + c = 3$ and $\alpha + \beta = 3$, $a = \alpha = 1$.

31. For (a), we see immediately from Problem 29a that $\xi + l$ lies in $P_{1,0}$ but
not in $P_{0,0}$ and not in $P_{0,1}$. For (b), the formula $|N_{K/\mathbb{Q}}(\xi + 3)| = 2^2$ shows that




$(\xi + 3)$ is the product of exactly two of the prime ideals of norm 2; thus (a) implies
that $(\xi + 3) = P_{1,0}^2$. Similarly $|N_{K/\mathbb{Q}}(\xi - 1)| = 2^3$, and (a) gives $(\xi - 1) = P_{1,0}^3$.
Conclusion (c) is immediate from Problem 29a.

For (d), we have $(2)R \subseteq (2, \xi)$; thus $(2, \xi)$ is of the form $P_{0,0}^aP_{1,0}^bP_{0,1}^c$ with
$a + b + c \le 3$. Since $\xi$ is not in $P_{1,0}$, $b = 0$. Since $\xi$ is in $P_{0,0}$ and $P_{0,1}$, we must
have $a > 0$ and $c > 0$. Since the inclusion $(2)R \subseteq (2, \xi)$ is proper (because $\xi$ is not
in $(2)R = 2\mathbb{Z}(\{1, \xi, \eta\})$), $N((2, \xi)) \le 4$. Thus $a = c = 1$, and $(2, \xi) = P_{0,0}P_{0,1}$.

For (e), Problem 29a shows that $P_{0,1} = (2, \xi, \eta + 1)$. Thus $P_{0,1}^2$ contains 4 and
$\xi(\eta + 1) = 4 + \xi$, hence $\xi$. If $P_{0,1}^2$ contains also $\xi + l$ with $l \equiv 2 \bmod 4$, then it
contains $\xi + 2$, hence 2. This would mean that $P_{0,1}^2 \supseteq (2, \xi) = P_{0,0}P_{0,1}$. Since $P_{0,1}^2$
and $P_{0,0}P_{0,1}$ both have norm 4, they would have to be equal, and we would obtain
$P_{0,1} = P_{0,0}$, contradiction.

For (f), Problem 30b gives $N((\xi + 2)) = 8$. In view of (c), $(\xi + 2) = P_{0,0}^aP_{0,1}^c$
with $a + c = 3$ and $c \ge 1$. Part (d) shows that $c \le 1$. Thus $(\xi + 2) = P_{0,0}^2P_{0,1}$. The
argument for $(\xi - 2)$ is similar.


32. For (a), this kind of argument is done in a parenthetical remark at the end of
the solution of Problem 19. For (b), we have $(\xi + 2) = P_{0,0}^2P_{0,1}$ and $(\xi - 1) = P_{1,0}^3 =
(\xi + 3)P_{1,0}$. Thus the same kind of argument shows that $P_{0,1}$ and $P_{1,0}$ are principal.

For (c), we factor $X^3 + X^2 - 2X + 8$ modulo 3; there is no root in $\mathbb{F}_3$, and hence
the reduced polynomial is irreducible. By Theorem 5.6 the only prime ideal whose
norm is a power of 3 has norm $3^3$.

For (d), we factor $X^3 + X^2 - 2X + 8$ modulo 5 as $(X + 1)(X^2 - 2)$, and Theorem
5.6 gives us one prime ideal of norm 5 and one of norm $5^2$. The one of norm 5,
according to the theorem, is $(5, 1 + \xi)$. For (e), the technique of Problem 30a shows
that $N((1 + \xi)) = 10$. Thus the only possibility for the prime factorization of $(1 + \xi)$
is as $(5, 1 + \xi)P$, where $P$ is one of the three ideals of norm 2. For (f), since $(1 + \xi)$
and $P$ are principal, $(5, 1 + \xi)$ is principal, by the same technique as in earlier parts.

For (g), the prime factorization of nonzero ideals allows us to conclude that every
nonzero ideal of norm $\le 6$ is principal. Application of the technique after Theorem
5.21 shows that every ideal class has a representative with norm $\le 6.35$, hence norm
$\le 6$. All such ideals are principal, and therefore $R$ is a principal ideal domain.
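
The numerical facts used in parts (c) through (g) can be reproduced with a few lines. The sketch below makes three assumptions: $F(X) = X^3 + X^2 - 2X + 8$; the norm rule of Problem 30a in the form $|N(\xi - q)| = |F(q)|$; and the usual form $\frac{n!}{n^n}\left(\frac{4}{\pi}\right)^{r_2}|D_K|^{1/2}$ of the bound after Theorem 5.21, with $n = 3$, $r_2 = 1$, $|D_K| = 503$. The variable names are illustrative.

```python
import math

def F(x):
    """The cubic F(X) = X^3 + X^2 - 2X + 8."""
    return x**3 + x**2 - 2 * x + 8

# Roots of F modulo 3 and modulo 5.
roots_mod_3 = [x for x in range(3) if F(x) % 3 == 0]
roots_mod_5 = [x for x in range(5) if F(x) % 5 == 0]

# (X + 1)(X^2 - 2) = X^3 + X^2 - 2X - 2; compare coefficients with F modulo 5.
factor_ok_mod_5 = all((f - g) % 5 == 0
                      for f, g in zip((1, 1, -2, 8), (1, 1, -2, -2)))

# X^2 - 2 has no root modulo 5, so the quadratic factor is irreducible there.
quadratic_roots_mod_5 = [x for x in range(5) if (x * x - 2) % 5 == 0]

# |N(1 + xi)| = |F(-1)|, taking q = -1 in the norm rule.
norm_1_plus_xi = abs(F(-1))

# Bound on the norm of a representative of each ideal class.
minkowski_bound = (math.factorial(3) / 3**3) * (4 / math.pi) * math.sqrt(503)
```

Under these assumptions the bound evaluates to roughly 6.35, so only ideals of norm at most 6 need to be examined.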


Chapter VI 


1. Apply the Cauchy criterion. Since $|a_n + a_{n+1} + \cdots + a_m|_p \le \max_{n \le k \le m}|a_k|_p$,
the series is Cauchy, hence convergent, if and only if the terms tend to 0.

2. In (a), the equality $\mathrm{GCD}(3, 2^n) = 1$ implies that there exist integers $x_n$ and $y_n$
such that $3x_n - 2^ny_n = 1$. Then $x_n - \tfrac13 = 2^n3^{-1}y_n$. Applying the 2-adic absolute
value gives $|x_n - \tfrac13|_2 = 2^{-n}|y_n|_2 \le 2^{-n}$, and this tends to 0. For example take
$x_n = \tfrac13(2^{2n-1} + 1)$. In (b), the argument with $\tfrac{a}{b}$ replacing $\tfrac13$ is similar: to get




$|x - \tfrac{a}{b}|_2 \le 2^{-n}$, start by finding $x$ and $y$ with $bx - 2^ny = a$.
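
The explicit sequence $x_n = \tfrac13(2^{2n-1} + 1)$ in (a) can be examined with exact rational arithmetic. In this sketch the helper names `v2` and `x` are illustrative, and the 2-adic absolute value is $2^{-v_2}$.

```python
from fractions import Fraction

def v2(q):
    """2-adic valuation of a nonzero rational number q."""
    q = Fraction(q)
    num, den, v = q.numerator, q.denominator, 0
    while num % 2 == 0:
        num //= 2
        v += 1
    while den % 2 == 0:
        den //= 2
        v -= 1
    return v

def x(n):
    """The integer x_n = (2^(2n-1) + 1) / 3."""
    value, remainder = divmod(2 ** (2 * n - 1) + 1, 3)
    assert remainder == 0   # 2^(odd) is congruent to 2 mod 3, so the sum is divisible by 3
    return value

# v_2(x_n - 1/3) for n = 1, ..., 7; the 2-adic absolute value is 2 ** (-valuation).
valuations = [v2(x(n) - Fraction(1, 3)) for n in range(1, 8)]
```

The valuations are $2n - 1$, so $|x_n - \tfrac13|_2 = 2^{-(2n-1)} \le 2^{-n}$, and the integers $x_n$ converge 2-adically to $\tfrac13$.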

3. Write ideles as tuples indexed by $\infty, 2, 3, 5, \dots$. If $q$ is in $\mathbb{Q}^\times$, then $\iota(q) =
(q, q, q, q, \dots)$. If this is to be in $\mathbb{R}^\times \times \prod_p \mathbb{Z}_p^\times$, then the only restriction on the first
coordinate is that $q \ne 0$, but the other coordinates are restricted by $|q|_p = 1$ for all
primes $p$. This means that $q$ in lowest terms has no $p$ in either the numerator or the
denominator. So $q = \pm 1$. This proves (a).

In (b), let $(x_\infty, x_2, x_3, \dots)$ be in $\mathbb{I}$. Since $|x_p|_p \ne 1$ for only finitely many $p$, there
exists a unique positive rational $q$ such that $|q|_p = |x_p|_p$ for all $p$. Define $z_p = x_pq^{-1}$
as a member of $\mathbb{Q}_p^\times$. Then $|z_p|_p = |x_p|_p|q|_p^{-1} = 1$ shows that $|z_p|_p = 1$ for all $p$.
Finally define $r = x_\infty q^{-1}$ as a member of $\mathbb{R}^\times$. Then $(r, z_2, z_3, \dots)$ is in $\mathbb{I}(S_\infty)$, and
$(x_\infty, x_2, x_3, \dots) = (q, q, q, \dots)(r, z_2, z_3, \dots)$.

4. In (a), the norm of the ideal divides the norm of any element, and if the
norm of the ideal is prime, then the ideal is prime. With $K = \mathbb{Q}(\sqrt{-5})$, we have
$N_{K/\mathbb{Q}}(1 + \sqrt{-5}) = 6$, $N_{K/\mathbb{Q}}(3) = 9$, and $N_{K/\mathbb{Q}}(2) = 4$. Therefore $N((1 + \sqrt{-5}, 3))$
divides $\mathrm{GCD}(6, 9) = 3$, and $N((1 + \sqrt{-5}, 2))$ divides $\mathrm{GCD}(6, 4) = 2$. One checks
that these ideals are not all of $R$, and then the respective norms are 3 and 2. So
the ideals are prime. In (b), $(1 + \sqrt{-5}) = (1 + \sqrt{-5}, 2)(1 + \sqrt{-5}, 3)$, and $(3) =
(1 + \sqrt{-5}, 3)(1 - \sqrt{-5}, 3)$.

In (c), $\tfrac13(1 + \sqrt{-5})R = (1 + \sqrt{-5}, 2)(1 + \sqrt{-5}, 3)(1 + \sqrt{-5}, 3)^{-1}(1 - \sqrt{-5}, 3)^{-1}
= (1 + \sqrt{-5}, 2)(1 - \sqrt{-5}, 3)^{-1}$, and $(1 + \sqrt{-5}, 3)$ does not appear.

In (d), $\dfrac{1 + \sqrt{-5}}{3} = \dfrac{2(1 + \sqrt{-5})}{(1 + \sqrt{-5})(1 - \sqrt{-5})} = \dfrac{2}{1 - \sqrt{-5}}$.

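5. The mapping $\varphi : 1 + P^n \to P^n/P^{n+1}$ induced by $1 + x \mapsto x + P^{n+1}$ is
a homomorphism from $1 + P^n$ under multiplication into $P^n/P^{n+1}$ under addition
because the equalities $\varphi(1 + x) = x + P^{n+1}$, $\varphi(1 + y) = y + P^{n+1}$, and

\[
\begin{aligned}
\varphi((1 + x)(1 + y)) &= \varphi(1 + x + y + xy) \\
&= x + y + xy + P^{n+1} = x + y + P^{n+1}
\end{aligned}
\]

show that $\varphi((1 + x)(1 + y)) = \varphi(1 + x) + \varphi(1 + y)$. The kernel of $\varphi$ is the set of
all $1 + x$ with $x \in P^{n+1}$, i.e., $1 + P^{n+1}$, and the image is certainly all of $P^n/P^{n+1}$.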

6. The composition $\mathbb{I}^1/\iota(K^\times) \to \mathbb{I}/\iota(K^\times) \to \mathcal{I}/\mathcal{P}$ induced by the inclusion
$\mathbb{I}^1 \to \mathbb{I}$ and the passage from $\mathbb{I}$ to $\mathcal{I}$ discussed in Section 10 is onto $\mathcal{I}/\mathcal{P}$ because the
composition is affected by only the nonarchimedean places and because any member
of $\mathbb{I}$ can be adjusted at the archimedean places so as to be in $\mathbb{I}^1$. In addition, the
composition is continuous if $\mathcal{I}/\mathcal{P}$ is given the discrete topology. Since $\mathbb{I}^1/\iota(K^\times)$ is
compact, the discrete space $\mathcal{I}/\mathcal{P}$ has to be compact and must be finite.


7. Fix a finite subset $S$ of places containing $S_\infty$. Then the projection of $\prod_{w \in S} K_w^\times$
to $K_v^\times$ is continuous for each $v \in S$. Since also the inclusion $K_v^\times \to K_v$ is continuous,
the composition $\prod_{w \in S} K_w^\times \to K_v$ is continuous. Thus the corresponding mapping
$\prod_{w \in S} K_w^\times \to \prod_{w \in S} K_w$ is continuous. In similar fashion $\prod_{w \notin S} \mathbb{Z}_w^\times \to \mathbb{Z}_v$ is a




continuous function as a composition of continuous functions. Thus $\prod_{w \notin S} \mathbb{Z}_w^\times \to
\prod_{w \notin S} \mathbb{Z}_w$ is continuous. Putting these two compositions together shows that
$\mathbb{I}_K(S) \to \mathbb{A}_K(S)$ is continuous, and therefore $\mathbb{I}_K(S) \to \mathbb{A}_K$ is continuous. Since
this is true for each $S$, it follows that $\mathbb{I}_K \to \mathbb{A}_K$ is continuous.


8. Each $x_n$ lies in $\mathbb{A}_\mathbb{Q}(S_\infty)$, which is an open set in $\mathbb{A}_\mathbb{Q}$. For each prime $p$, $x_{n,p} = 1$
if $n$ is large enough, and also $x_{n,\infty} = 1$ for all $n$. Since $\mathbb{A}_\mathbb{Q}(S_\infty)$ has the product
topology, $\{x_n\}$ converges to $(1)$. On the other hand, if $\{x_n\}$ were to converge to some
limit $x$ in $\mathbb{I}_\mathbb{Q}$, then $x$ would have to lie in some $\mathbb{I}(S)$, and the ideles $x_n$ would have to
be in $\mathbb{I}(S)$ for large $n$. But $(x_{n,v})$ is not in $\mathbb{I}(S)$ as soon as $v$ is outside $S$.


9. For fixed $g$ in $G$, we have $d(\Phi(gx)) = d(\Phi(g)\Phi(x)) = d(\Phi(x))$, and hence
$d(\Phi(\cdot))$ and $d(\cdot)$ are Haar measures on $G$. Any two Haar measures are proportional,
and the result follows.


10. In (a) the equality is trivial if $c_1c_2 = 0$. When $c_1c_2 \ne 0$, we have $d(c_1c_2x) =
|c_1c_2|_F\,dx$ and also $d(c_1c_2x) = |c_1|_F\,d(c_2x) = |c_1|_F|c_2|_F\,dx$, and it follows that
$|c_1c_2|_F = |c_1|_F|c_2|_F$ in this case as well.

The proof of continuity is harder (but is essential to make sense out of (b)). We first
check continuity at each $c_0 \ne 0$. Let $f$ be a continuous real-valued function vanishing
off a compact set $S$, and let $N$ be a compact neighborhood of $c_0$ not containing 0. If $c$
is in $N$, then $f(c^{-1}x)$ is nonzero only for $x$ in the compact set $NS$. Let $\epsilon > 0$ be given.
Continuity of $(c, x) \mapsto f(c^{-1}x)$ allows us to find, for each $x$ in $NS$, an open
subneighborhood $N_x$ of $c_0$ and an open neighborhood $U_x$ of $x$ such that $|f(c^{-1}y) - f(c_0^{-1}x)| <
\epsilon$ for $c \in N_x$ and $y \in U_x$. Then $|f(c^{-1}y) - f(c_0^{-1}y)| < 2\epsilon$ for $c \in N_x$ and $y \in U_x$.
The open sets $U_x$ cover $NS$. Forming a finite subcover and intersecting the
corresponding finitely many sets $N_x$, we obtain an open neighborhood $N'$ of $c_0$ such
that $|f(c^{-1}y) - f(c_0^{-1}y)| < 2\epsilon$ for $c \in N'$ whenever $y$ is in $NS$. As a result,
$c \mapsto \int_F f(c^{-1}x)\,dx$ is continuous at $c = c_0$. Therefore $c \mapsto |c|_F\int_F f(x)\,dx$ is
continuous at $c_0$, and so is $c \mapsto |c|_F$.

To prove continuity at $c = 0$, we are to show that $\lim_{c \to 0}\int_F f(c^{-1}x)\,dx = 0$ for
$f$ as above. Let $U$ be any compact neighborhood of 0 in $F$. Find a sufficiently small
neighborhood $N$ of 0 in $F$ such that $c \in N$ implies that $cS$ does not meet $U^c$. Then
$c^{-1}U^c \cap S = \varnothing$. For such $c$'s, we have $\bigl|\int_F f(c^{-1}x)\,dx\bigr| = \bigl|\int_U f(c^{-1}x)\,dx\bigr| \le
\|f\|_{\sup}\,(dx(U))$, and the desired limit relation follows.

For (b), we have $d(cx)/|cx|_F = (|c|_F\,dx)/(|c|_F|x|_F) = dx/|x|_F$. For (c), $|x|_F =
|x|$ if $F = \mathbb{R}$, and $|x|_F = |x|^2$ if $F = \mathbb{C}$. For (d), $|x|_F = |x|_p$ if $F = \mathbb{Q}_p$. For (e),
we have $J = p\mathbb{Z}_p$, and therefore the Haar measure of $J$ is the product of $|p|_p = p^{-1}$
times the Haar measure of $\mathbb{Z}_p$. Hence the Haar measure of $J$ is $p^{-1}$.

11. If $F$ has characteristic $p' \ne 0$, then the sum $1 + \cdots + 1$ with $p'$ terms is 0
in $R$, and it must be 0 in $R/\mathfrak{p}$. So $R/\mathfrak{p}$ must have characteristic $p'$. Thus any such
$p' \ne 0$ must be $p$.

12. In (a), apply Corollary 6.29 with $f(X) = X^{q-1} - 1$ in $R[X]$. Every nonzero
$\bar a$ is a simple root of the reduced polynomial $\bar f(X) = X^{q-1} - 1$ in $\mathbb{F}_q[X]$, simple




because $(q - 1)(\bar a)^{q-2} \ne 0$. The corollary produces a root $a$ of $f(X)$ whose image
in $R/\mathfrak{p}$ is $\bar a$. In this way we obtain $q - 1$ distinct roots of 1 in $R$, each corresponding
to a different coset in $R/\mathfrak{p}$. Together with 0, these exhaust the cosets of $R/\mathfrak{p}$.

In (b), if $F$ has characteristic $p$, then raising to the $p$-th power is a field mapping
of $F$ into itself. Since $q = p^m$, raising to the $q$-th power is the $m$-fold iterate of
a field map and is a field map. If $a$ and $b$ are two $(q - 1)$-st roots of 1 in $R$, then
$(a \pm b)^q = a^q + (\pm b)^q = a + (\pm b)$, and so $a \pm b$ is either 0 or a $(q - 1)$-st root of 1. Since the
nonzero elements of $E$ are closed under inverses, $E$ is a subfield.


13. In (a) let $x$ be in $R$. Problem 12 produces a unique $a_0 \in E$ with $x - a_0$ in $\mathfrak{p}$, i.e.,
with $v(x - a_0) \ge 1$. Then $v(t^{-1}(x - a_0)) \ge 0$, and Problem 12 produces a unique
$a_1$ in $E$ with $t^{-1}(x - a_0) - a_1$ in $\mathfrak{p}$. Continuing in this way, we obtain $a_0, \dots, a_N$ in
$E$ with

\[
t^{-1}(\cdots t^{-1}(t^{-1}(x - a_0) - a_1) - \cdots - a_{N-1}) - a_N
\]

in $\mathfrak{p}$. Thus $v\bigl(x - \sum_{k=0}^N a_kt^k\bigr) \ge N + 1$. Since $F$ is complete, $\sum_{k=0}^\infty a_kt^k$ converges
with sum $x$. The statement about the value of $v$ is clear.

In (b), the part about the series giving an element in $R$ is immediate from Problem 1,
since $t^k$ has limit 0. The operations on $R$ now match those on $\mathbb{F}_q[[t]]$, and the
isomorphism follows. For (c), let $x$ be given with $x \notin R$. Set $v(x) = -N$. Then $v(t^Nx) = 0$,
and we can apply (a) to write $t^Nx = \sum_{k=0}^\infty a_kt^k$. Then $x = \sum_{k=0}^\infty a_kt^{k-N}$ as required.

14. In (a), the inclusion of the integers into R, followed by passage to the quotient 
R/p, is an additive homomorphism. Since R/p has order g, g must map to the 0 
coset, namely p. 

Part (a) shows that v(g) > 1. Since v(qg) = v(p”) = mv(p), v(p) is positive, 
and (b) is proved. The same argument as in the proof of Ostrowski’s Theorem shows 
that v(p’) = 0 for all prime numbers other than p, and then (c) is immediate. For 
(d), it is enough to check equality of the absolute values in question on the element 
p, and for that we have peo = gq VP)/(m) — go lim = pl, 

For (e), the map of $\mathbb{Q}'$ to $\mathbb{Q}$, when composed with the completion $\mathbb{Q} \to \mathbb{Q}_p$, is a
homomorphism of valued fields into a complete field. It therefore extends uniquely
as a homomorphism of the closure $\overline{\mathbb{Q}'}$ into $\mathbb{Q}_p$. The dense set $\mathbb{Q}'$ maps to the dense
set $\mathbb{Q}$, and hence the extended map is an isomorphism.

Part (f) is just a repetition of the argument in Problems 13a and 13c. In (g), let 
x = 2 axt* be the expansion of f, and put cj, = pale agt,. Since v(t) = 1, we 
obtain v(x — cj,) = v(t’) = vou(t) = vo. Therefore v(p (x — Cjy)) = O. Iterating 
this procedure as in Problem 13a, we obtain a convergent expansionx = )-p-. cj, De 
For (h), we then have x = )-72)c;, pk = 3 Cj tklLie=i} p*, and we see that x 
lies in aa Oc. Therefore dim[F : Q’] <1. 

15. Part (a) is immediate, and (b) follows from Theorem 6.33. For (c), R/p 
corresponds to extracting the constant term from a power series in f, and thus L/g = 
Fs is of dimension f over R/p = Fy. The computation pT = tUT = tT = 
tRT = pT = P® shows that K/L has ramification index e. For (d), each index 




(residue class degree and ramification index) for K /F is the product of that index for 
K/L and that index for L/F. So e for L/F is 1, and f for K/L is 1. 


16. For (a), the irreducible polynomial $\bar g(X)$ has to be separable, and therefore all
of its roots in $k_K$ are simple. Application of Hensel's Lemma in the form of Corollary
6.29 produces $\alpha$. For (b), the polynomial $g(X)$ is monic with coefficients in $R$, and
its root $\alpha$ is therefore a member of $L$ integral over $R$. Thus $\alpha$ lies in $U$. The natural
field map $U/\wp \to T/P$ takes $u + \wp$ to $u + P$, hence takes $\alpha + \wp$ to $\alpha + P = \bar\alpha$. Thus
we can regard $\bar\alpha$ as a member of $k_L$. Since $k_F$ and $\bar\alpha$ generate $k_K$ by construction of
$\bar\alpha$, $k_L = k_K$.

For (d), let us use subscripts on the indices $e$ and $f$ to indicate the field extension
in question. Then we have $e_{L/F}f_{L/F} = [L : F] = \deg g(X) = \deg \bar g(X) =
[k_K : k_F] = f_{K/F}$ on the one hand and $f_{K/F} = [k_K : k_F] = [k_L : k_F] = f_{L/F}$ on
the other hand. The two chains of equalities together show that $e_{L/F} = 1$, and the
second one in combination with $f_{K/F} = f_{K/L}f_{L/F}$ shows that $f_{K/L} = 1$.

17. In (a), the element y; exists and is unique because of the nondegeneracy of the 
trace form, which holds because K /F is separable (Theorem 8.54 and Section IX. 15 
of Basic Algebra). 

In (b), the expression for the $z_i$'s in terms of the $y_j$'s shows that $\sum_k Rz_k \subseteq
\sum_j Ry_j$. The assumption $\det A = \pm 1$ implies that $B = A^{-1}$ lies in $M_n(R)$. Since
$y_j = \sum_k B_{kj}z_k$, we obtain $\sum_j Ry_j \subseteq \sum_k Rz_k$.

For (c), it is evident that the degree is at most $n - 1$. Write $g(X) = \prod_j (X - \xi_j)$. The
opening computations of Section V.4 show that $g'(\xi_i) = \prod_{j \ne i}(\xi_i - \xi_j)$. Therefore
the value of the left side at $\xi_k$ for the identity in question is

\[
\sum_{i=1}^n \frac{\prod_{j \ne i}(\xi_k - \xi_j)}{\prod_{j \ne i}(\xi_i - \xi_j)}.
\]

The numerator is 0 unless $i = k$. Thus only the $k$-th term makes a contribution, and its
value, namely 1, matches the value of the right side. Then (d) is a routine computation.
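
The evaluation in (c) is the familiar Lagrange interpolation argument, and it can be illustrated with exact arithmetic. This sketch assumes that the identity in question has the form $\sum_i g(X)/\bigl((X - \xi_i)g'(\xi_i)\bigr) = 1$; the problem statement may carry additional factors, and the function name and sample nodes are illustrative.

```python
from fractions import Fraction

def lagrange_sum(nodes, x):
    """Sum over i of prod_{j != i} (x - xi_j) / prod_{j != i} (xi_i - xi_j)."""
    total = Fraction(0)
    for i, xi_i in enumerate(nodes):
        num, den = Fraction(1), Fraction(1)
        for j, xi_j in enumerate(nodes):
            if j != i:
                num *= x - xi_j
                den *= xi_i - xi_j
        total += num / den
    return total

# Distinct sample nodes standing in for the roots xi_1, ..., xi_n.
nodes = [Fraction(k) for k in (0, 1, 3, 7)]
values_at_nodes = [lagrange_sum(nodes, x) for x in nodes]
values_elsewhere = [lagrange_sum(nodes, Fraction(k, 2)) for k in range(-5, 20)]
```

A polynomial of degree at most $n - 1$ that takes the value 1 at $n$ distinct points is identically 1, which is why the sum equals 1 at every sample point and not only at the nodes.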

For (e), the rational expression (1 + c)X +--+: + C,X")—! on the left side is 
expanded in series using (1 + Z)~! = 1—- Z + Z? — Z> +---. Thus the left side 
is the sum of X” and a series beginning with a multiple of X”*+!. The right side is 
>i Tre /F (s’ (E)~'ek&x ret) , and the conclusion of the problem results by equating 
the indicated coefficients. 

For (f), the result of (e) handles the entries with i + j < n+ 1. For those with 
n+2 <i+j <2n, we write &'+/~79/(E)—! as E"E'+/—-"—-2 9 (E)—!, substitute for E” 
recursively from the field polynomial, and check that the traces are in R by applying 
(e). Thus all Aj; are in R. 

For (g), conclusion (f) shows that A is triangular with 1’s on the off diagonal, and 
hence the determinant of A is £1. Put z, = pat Ajxyj. Since x; = et 


Tr jr (ZeXi) = D0; Aje TrK je (yjxi) = Aik 
= Trx/r((g/ (&) 1A! )€'!) = Tree jp((g/(€) EF!) x3). 




Therefore z, = g/(€)~'é*-!. Combining this equality with (b) shows that N= 
Do Ryj = Ly Ree = Ly RBG) NET = (IN. 

18. For (a), the assumption f = n makes dimx, (kx) = n. Thus deg g(X) = 
deg g(X) = n. Since g(X) is irreducible, so is g(X). The root a of g(X) in K is 
such that F (a) is an n-dimensional subspace of K,, hence equals K 

For (b), the conclusion N =) T follows from the definition. Since F = D(K/F)~ Mm 
we obtain D(K/F)~! C N= g’(a)!N C g(a)" !T. 

For (c), the polynomial g(X) was constructed as irreducible, and g(X) was con- 
structed to reduce to g(X). Then 2’ (@) 0, and it follows that g’(@) is in T but not 
P. Thus g/(a) is a unit in T, and g/(w)"!T = T. Then D(K/F)7! C T. Since 
D(K/F)~! DT also, D(K/F)~! = T, and D(K/F) = 

19. For (a), we may assume that $v(x_1) \le v(x_j)$ for $j > 1$. If $v(x_1) < v(x_j)$
for all $j > 1$, then induction and use of property (vi) of discrete valuations show
that $v(0) = v(x_1 + \cdots + x_m) = v(x_1)$, contradiction.

For (b), the element $\pi$ is in $T$, and its minimal polynomial has coefficients in $R$
because $T$ is integral over $R$; in turn, the field polynomial is a power of the minimal
polynomial. Since $c_j$ is in $R$, we have $v_K(c_j) = n\,v_F(c_j)$, and therefore $v_K(c_j)$ is
divisible by $n$.

For (c), apply (a) to the equality cow” +c;2"~! +---+¢, = 0 to produce indices 
i < j with v(cja"!) = v(cjn"/) and with v(cea"*) > v(cin"~) for all k. The 
equality involving i and j implies that j — i = ux (cj) — vx(cj). Fromi < j <n, 
we have n —i > 0. Thus v(cjx"~') > v(cj) > 0. By (b), v(cja"~!) > n. So 
v(cyn"—*) > n. 

In (d), the right side of the equality j — i = vx (cj) — vx (c;) 1s divisible by n, 
by (b), and the left side is between 1 and n. Hence the two sides equal n, and we 
conclude that i = 0 and j =n. Thus the equality says that n = ux (cy). Since Cy is 
in F and since ux = nur, vF(c,) = 1. Therefore c, is in p but not bo: The inequality 
UK (cy *) > n implies that ux (cx) > k. For 1 < k <n, this conclusion implies 
that vx (cx) > 1. Since cx is in F and since vx = nvuf, vr(cx) > O fork > 1. Thus 
ce isinp fork > 1. 

In (e), the irreducibility is immediate from the Eisenstein irreducibility criterion, $R$
being a principal ideal domain. Since the field polynomial is a power of the minimal
polynomial, the field polynomial equals the minimal polynomial. Then the degree of
$F(\pi)$ is $n$. Since $F(\pi)$ is an $n$-dimensional subfield of the $n$-dimensional field $K$,
$K = F(\pi)$.

Part (f) is proved in the same way as Problem 14g. For (g), the expansion can be 
rewritten as ) y= 4k Yk = Vimo Lo< jee Veit eit) = Lo<j<e™! (so72o eit iA’). 
The term in parentheses is the most general member of R, and the left side is the most 
general member of T. Thus (g) follows. 

In (h), conclusion (g) shows that N = ae, =0 Rak equals T, and Problem 17 with 
€ = 7 shows that N = g'(x)—'N. Thus D(K/F)~! = =T= g'()~!T. Multiplying 
by (g’(1))D(K /F), we obtain D(K /F) = (g’(z)). 




For (i), g(a) = en®@ 14+ yi Cn_-ekak—! = enx®-! + b. In each term of b, 
UK (kCn_k) > CUF(Cn—K) = e, and vx (wk!) = k—1. Thus vx(b) > e. Meanwhile, 
uk (er?!) = (e — 1) + ux (e). Thus vx (g/()) = min ((e — 1) + vk(e), vx (d)), 
and property (vi) of discrete valuations shows that equality holds if the two members 
(e — 1) + vx(e) and vx (b) of the minimum are unequal. If vx (e) = 0, then the 
members are unequal, and we obtain vx (g’(7)) = e — 1. Otherwise, we obtain 
vk (g/(t)) => e. We know that D(K/F) = (g/()) = P® 8’), and Lemma 6.47 
follows. 


Chapter VII 


1. If $x$ and $y$ are members of $L$ purely inseparable over $K$, then $x^{p^e}$ and $y^{p^{e'}}$ are
in $K$ for suitable $e$ and $e'$. Without loss of generality, let $e' \le e$. Then $x^{p^e}$ and $y^{p^e}$
are in $K$, and hence $(x \pm y)^{p^e} = x^{p^e} \pm y^{p^e}$ are in $K$ and so are $(xy)^{p^e} = x^{p^e}y^{p^e}$ and
$(xy^{-1})^{p^e} = x^{p^e}y^{-p^e}$ if $y \ne 0$. So $x \pm y$, $xy$, and $xy^{-1}$ are purely inseparable over
$K$, the last of these if $y \ne 0$.

2. In view of Proposition 7.10, the given conditions imply that $[K(\alpha) : K] = p^e[K(\alpha^{p^e}) : K]$ and that $X^{p^u} - \alpha^{p^e}$ is irreducible over $K(\alpha^{p^e})$ for every $u \ge 0$. Since $\alpha^{p^{e-u}}$ is a root of this polynomial within $K(\alpha)$ for each $u \le e$, $K(\alpha)$ has a chain of subfields

$$K(\alpha^{p^e}) \subseteq K(\alpha^{p^{e-1}}) \subseteq \cdots \subseteq K(\alpha^{p}) \subseteq K(\alpha)$$

in which the consecutive degrees of the extensions are all $p$. Let $\beta$ be separable over $K$, and let $K(\alpha^{p^r})$ be the first of these fields to contain $\beta$. Arguing by contradiction, suppose that $r < e$. Then $\beta$ and $\alpha^{p^{r+1}}$ generate $K(\alpha^{p^r})$ because $[K(\alpha^{p^r}) : K(\alpha^{p^{r+1}})]$ is prime. The separability of $\beta$ over $K$ implies that $\beta$ is separable over $K(\alpha^{p^{r+1}})$, hence that $K(\alpha^{p^r})$ is separable over $K(\alpha^{p^{r+1}})$, hence that $\alpha^{p^r}$ is separable over $K(\alpha^{p^{r+1}})$. Since $(\alpha^{p^r})^p$ lies in $K(\alpha^{p^{r+1}})$, $\alpha^{p^r}$ is also purely inseparable over $K(\alpha^{p^{r+1}})$. By Corollary 7.12, $\alpha^{p^r}$ lies in $K(\alpha^{p^{r+1}})$. This contradicts the fact that the above chain of subfields is strictly increasing. We conclude that $r = e$. Hence all elements $\beta$ separable over $K$ lie in $K(\alpha^{p^e})$.


3. For suitable integers $R_a$, we form the tuple $z = (R_a + a\mathbb Z)_{a \ge 1}$, using the realization of the inverse limit in Proposition 7.27. We have to specify the integers $R_a$. The condition for $z$ to lie in $\widehat{\mathbb Z}$, coming from the condition $f_{ab} \circ f_b = f_a$ when $a$ divides $b$, works out to be that $R_b - R_a$ is divisible by $a$ whenever $a$ divides $b$. After the integers $R_a$ have been defined for all $a$, it is enough to check that $R_{pa} - R_a$ is divisible by $a$ whenever $p$ is prime.

For $n$ odd, define $R_{2^e n} = nk + 1$, where $k$ is the unique integer from $0$ to $2^e - 1$ such that $nk + 1$ is divisible by $2^e$. This $k$ exists and is unique because $-n$ has an inverse modulo $2^e$. One checks that $R_{2^{e+1}n} - R_{2^e n}$ is divisible by $2^e$ and by $n$, and that $R_{2^e pn} - R_{2^e n}$ is divisible by $2^e$ and by $n$ if $p$ is an odd prime. The definition makes $R_2 \equiv 0 \bmod 2$ and $R_q = 1$ for every odd prime $q$, and therefore $z$ is not of the form $z_c$ for any integer $c$.
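
The construction can be checked numerically for small indices. The sketch below computes $R_a$ exactly as defined above and tests the compatibility condition that $a$ divides $R_b - R_a$ whenever $a$ divides $b$; the names `R` and `compatible` and the search bound are illustrative choices, not part of the text.

```python
def R(a):
    """Representative R_a for a = 2^e * n with n odd.

    R_a = n*k + 1, where 0 <= k < 2^e and 2^e divides n*k + 1.
    """
    e, n = 0, a
    while n % 2 == 0:
        n //= 2
        e += 1
    k = next(k for k in range(2 ** e) if (n * k + 1) % 2 ** e == 0)
    return n * k + 1

def compatible(limit):
    """Check that a | b implies a | R_b - R_a for 1 <= a <= b <= limit."""
    return all((R(b) - R(a)) % a == 0
               for a in range(1, limit + 1)
               for b in range(a, limit + 1, a))

print(compatible(200))
print(R(2) % 2, R(3), R(5), R(7))
```

An integer $c$ giving the same tuple would have to be even and also congruent to 1 modulo every odd prime, which is impossible.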




4. The first part is immediate from Theorem 7.34. For the second part the group 
Gal(R/Q) is trivial. In fact, any member of Gal(R/Q) must fix Q and map squares in 
R to squares. It therefore respects the ordering. For any r € R, it fixes each rational 
less than r, and hence it fixes r. 


5. Use $K_n = \mathbb Q(\sqrt{p_1}, \dots, \sqrt{p_n})$, where $p_n$ is the $n$th prime, and Proposition 7.30 to see that $\mathrm{Gal}(K/\mathbb Q)$ is an infinite product of groups of order 2. (A problem at the end of Chapter IX of Basic Algebra can help with this step.) The open subgroups of index 2 correspond to quadratic extensions of $\mathbb Q$, of which there are countably many. Since $\mathrm{Gal}(K/\mathbb Q)$ has uncountably many subgroups of index 2, such a subgroup $H$ exists that is not open. The field extension $K/\mathbb Q$ is normal, and thus $\mathrm{Gal}(K/\mathbb Q)$ is a homomorphic image of $\mathrm{Gal}(\mathbb Q_{\mathrm{alg}}/\mathbb Q)$, say by a homomorphism $\varphi$. Then $\varphi^{-1}(H)$ is the required subgroup of $\mathrm{Gal}(\mathbb Q_{\mathrm{alg}}/\mathbb Q)$.


6. Suppose $I$ is primary. If $b + I$ is a zero divisor in $R/I$, then $ab$ is in $I$ for some $a$ not in $I$. Since $I$ is primary, $b^m$ is in $I$ for some $m$. Thus $(b + I)^m = b^m + I = I$, and $b + I$ is nilpotent in $R/I$.

If every zero divisor in $R/I$ is nilpotent, then the ideal 0 in $R/I$ is primary because whenever $(a + I)(b + I) = I$ and $a + I \ne I$, then the nilpotence of $b + I$ implies that $b^m + I = I$ for some $m$. This says that the 0 ideal $0 + I$ in $R/I$ is primary.

If the 0 ideal in $R/I$ is primary and if $ab$ is in $I$ with $a$ not in $I$, then $(a + I)(b + I) = I$ with $a + I \ne I$, and hence $(b + I)^m = I$ for some $m$, 0 being primary in $R/I$. This means that $b^m$ is in $I$, and $I$ is primary.
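
As an illustration of the equivalence, one can take $R = \mathbb Z$ and $I = (n)$ and test both conditions by brute force. This is only a sketch for this one ring; the function names are hypothetical.

```python
def is_nilpotent(b, n):
    """True if some power b^m with 1 <= m <= n is 0 modulo n."""
    x = b % n
    for _ in range(n):
        if x == 0:
            return True
        x = x * b % n
    return False

def ideal_is_primary(n):
    """Definition: ab in (n) with a not in (n) forces a power of b into (n)."""
    return all(is_nilpotent(b, n)
               for a in range(1, n) for b in range(n)
               if a * b % n == 0)

def zero_divisors_are_nilpotent(n):
    """Every zero divisor of Z/nZ is nilpotent."""
    zero_divisors = [b for b in range(n)
                     if any(a * b % n == 0 for a in range(1, n))]
    return all(is_nilpotent(b, n) for b in zero_divisors)

for n in (8, 9, 12, 30):
    print(n, ideal_is_primary(n), zero_divisors_are_nilpotent(n))
```

For these moduli the two tests agree, and they succeed for the prime powers 8 and 9 while failing for 12 and 30.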


7. In (a), if $xy$ is in $\sqrt I$, then $(xy)^m$ is in $I$ for some $m$, and therefore either $x^m$ is in $I$ or $y^{mn}$ is in $I$ for some $n$, i.e., either $x$ is in $\sqrt I$ or $y$ is in $\sqrt I$.

In (b), let $x$ be in $\sqrt I$, and choose $n$ such that $x^n$ is in $I$. Then $x^n$ is in $J$ because $I \subseteq J$. Since $J$ is prime, some factor of $x^n$ is in $J$, i.e., $x$ is in $J$.


8. In (b), $R/I \cong \mathbb C[y]/(y^2)$. The zero divisors of $R/I$ are $cy$ with $c \in \mathbb C$, and $(cy)^2 = 0$ in $R/I$ shows that $cy$ is nilpotent in $R/I$. By Problem 6, $I$ is primary. The radical $P = \sqrt I$ is $(x, y)$ by inspection, and this is prime. Since $P^2 = (x^2, xy, y^2)$, we have $P^2 \subsetneq I \subsetneq P$. If $I = Q^n$ for some prime ideal $Q$, then $I \subseteq Q$, and Problem 7b shows that $\sqrt I \subseteq Q$. Since $\sqrt I$ is maximal in this case, $Q$ has to be $P$.

In (c), $R/P = K[X, Y, Z]/(XY - Z^2, X, Z) \cong K[Y]$, and this is an integral domain. Hence $P$ is prime. Next, $P^2 = (x^2, xz, z^2)$. Thus $xy = z^2$ lies in $P^2$. However, $x$ is not in $P^2$, and $y^m$ is not in $P^2$ for any $m > 0$. So $P^2$ is not primary.

9. Let $a$ and $b$ be in $R$ with $ab$ in $I$ and $a$ not in $I$. To show that $I$ is primary, we are to show that $b$ is in $\sqrt I$. We do this by showing that $(b) + I \subseteq \sqrt I$. The ideal $(b) + I$ is proper, since otherwise $1 = cb + x$ with $x \in I$, which implies that $a = cba + xa$ is in $I$, contradiction. Let $J$ be a maximal ideal with $(b) + I \subseteq J$. It is enough to show that $\sqrt I \subseteq J$; in fact, then $\sqrt I = J$ because $\sqrt I$ is assumed maximal, and $(b) + I \subseteq \sqrt I$ as asserted. So let $u$ be in $\sqrt I$. Then $u^m$ is in $I \subseteq J$ for some $m$, and $u$ is in $J$ because $J$ is prime.

This proves the first part. The second part follows from the observation that if $J$ is maximal, then $\sqrt{J^n} = J$. In fact, $J^n$ contains all elements $a^n$ for $a \in J$. So $\sqrt{J^n}$ has to contain all elements $a \in J$. Since $J$ is maximal and $\sqrt{J^n}$ has to be proper, $\sqrt{J^n} = J$.

10. In (a), let $P$ be a prime ideal, and suppose that $P = I \cap J$ nontrivially. If $i$ is in $I$ but not $J$ and if $j$ is in $J$ but not $I$, then $ij$ is in $P$, but $i$ is not in $P$ because $i$ is not in $J$ and similarly $j$ is not in $P$ because $j$ is not in $I$.

In (b), $I^2 = (x^2, xy, y^2)$ is primary by Problem 9. The equality of $I^2$ with $(Rx + I^2) \cap (Ry + I^2)$ holds by inspection.

11. Arguing by contradiction, we can use the Noetherian property to obtain an ideal $I$ maximal with respect to the property of not being a finite intersection of proper irreducible ideals. Since $I$ is not irreducible, $I = A \cap B$ nontrivially. By maximality, $A$ and $B$ are finite intersections of proper irreducible ideals, and then so is $I$, contradiction.


12. Let $Q$ be a proper irreducible ideal in $R$. Then 0 is a proper irreducible ideal in $R/Q$. We show that 0 is primary in $R/Q$, and then Problem 6 shows that $Q$ is primary. Thus let $xy = 0$ in $R/Q$ with $y \ne 0$ in $R/Q$. We want to see that some power of $x$ is 0 in $R/Q$. In $R/Q$, we form the sequence of annihilators $\mathrm{Ann}(x) \subseteq \mathrm{Ann}(x^2) \subseteq \cdots$ and use the Noetherian property of $R$ and its quotient $R/Q$ to obtain $\mathrm{Ann}(x^l) = \mathrm{Ann}(x^{l+1})$ for some $l$. Let us see that the intersection $(x^l) \cap (y)$ is 0 in $R/Q$. In fact, if $a$ is in $(y)$, then $xy = 0$ implies $ax = 0$, and if $a$ is in $(x^l)$, then $a = bx^l$ and $0 = ax = bx^{l+1}$, from which we see that $b$ is in $\mathrm{Ann}(x^{l+1}) = \mathrm{Ann}(x^l)$. Therefore $a = bx^l = 0$ in $R/Q$. Thus indeed $(x^l) \cap (y) = 0$. Since 0 is irreducible in $R/Q$ and $(y) \ne 0$, we conclude that $(x^l) = 0$ and $x^l = 0$ in $R/Q$. This is what we were to show.


13. If $ab$ is in $Q$ and $a$ is not in $Q$, then $ab$ is in $Q_i$ for all $i$ and $a$ is not in $Q_{i_0}$ for some $i_0$. Since $Q_{i_0}$ is primary, $b^m$ is in $Q_{i_0}$ for some $m$, i.e., $b$ is in $\sqrt{Q_{i_0}} = P$. Since $\sqrt{Q_i} = P$ for all $i$, $b^{k_i}$ is in $Q_i$ for some $k_i$ depending on $i$. Taking $N$ to be the maximum of the integers $k_i$, we see that $b^N$ is in each $Q_i$ and hence is in their intersection $Q$. Thus $Q$ is primary.

Problem 7b shows that $\sqrt Q \subseteq P$. On the other hand, if $b$ is in $P$, we have just seen that some power $b^N$ lies in $Q$. So $b$ lies in $\sqrt Q$. Therefore $\sqrt Q = P$.

14. Problem 11 shows that every ideal is the finite intersection of proper irreducible ideals, and Problem 12 shows that these are primary. Thus if $I$ is given, we have $I = \bigcap_i Q_i$ with each $Q_i$ primary. Group all $Q_i$'s whose associated prime ideal is the same $P_j$, and denote the intersection of these by $Q'_j$. The ideal $Q'_j$ is primary by Problem 13. Then $I = \bigcap_j Q'_j$, and the $Q'_j$ have distinct associated prime ideals. So condition (ii) is satisfied. Finally among all expressions for $I$ as intersections satisfying (ii), choose one that involves the smallest number of primary ideals. This minimality forces (i) to hold.


Chapter VIII 


1. $(q^{n+1} - 1)/(q - 1) = 1 + q + q^2 + \cdots + q^n$.




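3. It is enough to consider a monomial $F(X_1, \dots, X_n) = X_1^{a_1} \cdots X_n^{a_n}$ with $\sum_j a_j = d$. Then $X_j \frac{\partial}{\partial X_j}\big(X_1^{a_1} \cdots X_n^{a_n}\big) = a_j X_1^{a_1} \cdots X_n^{a_n}$, and the sum on $j$ equals $d\,X_1^{a_1} \cdots X_n^{a_n}$.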
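
The monomial computation can be checked on a sample homogeneous polynomial. In this sketch a polynomial is stored as a dictionary from exponent tuples to coefficients; that representation, the helper names, and the sample $F$ are assumptions made only for the illustration.

```python
def partial(poly, j):
    """Partial derivative with respect to the j-th variable."""
    out = {}
    for exps, c in poly.items():
        if exps[j] > 0:
            new = exps[:j] + (exps[j] - 1,) + exps[j + 1:]
            out[new] = out.get(new, 0) + c * exps[j]
    return out

def times_variable(poly, j):
    """Multiply every term by the j-th variable."""
    return {exps[:j] + (exps[j] + 1,) + exps[j + 1:]: c
            for exps, c in poly.items()}

def euler_sum(poly, nvars):
    """Compute the sum over j of X_j * dF/dX_j."""
    total = {}
    for j in range(nvars):
        for exps, c in times_variable(partial(poly, j), j).items():
            total[exps] = total.get(exps, 0) + c
    return total

# Sample: F = X^2 Y + 3 X Y Z - 5 Z^3, homogeneous of degree d = 3.
F = {(2, 1, 0): 1, (1, 1, 1): 3, (0, 0, 3): -5}
d = 3
print(euler_sum(F, 3) == {exps: d * c for exps, c in F.items()})
```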

4. If $f^\varphi$ and $g^\varphi$ have a nontrivial common factor in $B[X]$, then $0 = R(f^\varphi, g^\varphi) = \varphi(R(f, g))$. Since $\varphi$ is one-one, $R(f, g) = 0$. Therefore $f$ and $g$ have a nontrivial common factor in $A[X]$.

5. Let us show that if $g_n \ne 0$ and $f_m = 0$, then Theorem 8.1 for indices $(m-1, n)$ implies the theorem for indices $(m, n)$, and vice versa. Assume for the moment that $m \ge 2$. Let $\mathcal R(f, g)$ be the resultant matrix of size $m + n$ that takes into account all coefficients $f_0, \dots, f_m$ of $f$, and let $R(f, g)$ be its determinant. With $f_m = 0$, let $\mathcal R'(f, g)$ be the resultant matrix of size $m + n - 1$, and let $R'(f, g)$ be its determinant. The matrix $\mathcal R'(f, g)$ is obtained by erasing the $m^{\text{th}}$ row and last column of $\mathcal R(f, g)$. On the other hand, the only nonzero entry in the last column of $\mathcal R(f, g)$ is $g_n$. Expansion in cofactors therefore gives $R(f, g) = g_n R'(f, g)$. The hypotheses of Theorem 8.1 apply to $f$ and $g$ for either of these resultants, and we have just seen that the two conditions (c) are equivalent. Certainly the two conditions (a) are equivalent. For the two conditions (b), the resultant of size $m + n - 1$ tells us that $a'f + b'g = R'(f, g)$ with $\deg a' < n$ and $\deg b' < m - 1$. Certainly this implies that $af + bg = R(f, g)$ with $a = a'g_n$ and $b = b'g_n$. Conversely if $af + bg = R(f, g)$ with $\deg a < n$ and $\deg b < m$, we define $a' = ag_n^{-1}$ and $b' = bg_n^{-1}$. Then $a'f + b'g = R'(f, g)$ with $\deg a' < n$, and we need to see that $\deg b' = \deg b < m - 1$. Since $f_m = 0$, all the powers of $X$ in $af$ are $\le (n - 1) + (m - 1)$, and the same must be true in $bg$. Since $g$ has degree $n$, we must have $\deg b \le m - 2 < m - 1$, as required.

Next we check what happens when m = | and we are comparing the resultant of 
size n + | and a degenerate resultant whose matrix is of size n and contains only the 
entries of g. The determinant formula is still valid, and we see that R’(f, g) = gj. 
which is nonzero. Thus (a) and (c) are false for both sizes. For (b), we cannot have $af + bg = 0$ with $\deg b < 0$ and $b \ne 0$. We need to check that $af + bg = 0$ cannot happen with $\deg a < n$ and $\deg b < 1$; in fact, then $\deg bg = \deg g = n$, while $f_1 = 0$ implies that $\deg af < n + \deg f = n$. So we cannot have $af + bg = 0$ in this case either.

The result of these calculations is that Theorem 8.1 for $(m, n)$ is equivalent to the theorem for $(m-1, n)$ if $g_n \ne 0$ and $f_m = 0$. Using induction, we see that the theorem for $(m, n)$ is equivalent to the theorem for $(k, n)$ if $g_n \ne 0$ and $f_{k+1} = \cdots = f_m = 0$. Taking $k = \deg f$ gives the desired result.
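
The effect of a vanishing leading coefficient can be observed numerically. The sketch below builds a resultant matrix from ascending coefficient lists, with the rows of $f$ placed first; that layout is an assumption of this illustration and may differ from the book's convention by a sign. Under this layout, padding $f$ with one zero leading coefficient multiplies the determinant by $g_n$.

```python
from fractions import Fraction

def sylvester(f, g):
    """Resultant matrix for ascending coefficient lists f and g.

    Formal degrees are m = len(f) - 1 and n = len(g) - 1, so a trailing 0
    in f models a vanishing leading coefficient. There are n rows built
    from f followed by m rows built from g.
    """
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = [[0] * i + list(f) + [0] * (size - m - 1 - i) for i in range(n)]
    rows += [[0] * j + list(g) + [0] * (size - n - 1 - j) for j in range(m)]
    return rows

def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for k in range(c, size):
                a[r][k] -= factor * a[c][k]
    return result

f = [1, 2, 3]        # 1 + 2X + 3X^2
g = [-1, 0, 1, 5]    # -1 + X^2 + 5X^3, so g_n = 5
print(det(sylvester(f + [0], g)) == 5 * det(sylvester(f, g)))
```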




6. Proof via Nullstellensatz: Since $f$ is irreducible and $K[X_1, \dots, X_n]$ is a unique factorization domain, the principal ideal $(f)$ is prime. Corollary 7.2 shows that $g$ lies in $(f)$; hence $g = hf$ for some $h$.

Proof via resultants: The idea is to arrange to have

$$af + bg = R(f, g), \qquad (*)$$

with the resultant taken with respect to $X_n$. Proposition 8.1 shows that this happens if $f$ and $g$ are of positive degree in $X_n$, and we shall show that either this is the case or else $f$ divides $g$ for easy reasons. Since $f$ is nonconstant, it depends nontrivially on some $X_j$, and renumbering the variables allows us to assume that $f$ depends nontrivially on $X_n$. Then $f$ is of the form

$$f(X_1, \dots, X_n) = c_r(X_1, \dots, X_{n-1})X_n^r + \cdots + c_0(X_1, \dots, X_{n-1})$$

with $r > 0$ and with $c_r$ nonzero in $K[X_1, \dots, X_{n-1}]$. If $g = 0$, then certainly $f$ divides $g$. So we may assume that $g \ne 0$. Choose $a_1, \dots, a_{n-1}$ in $K$ such that

$$g(a_1, \dots, a_{n-1}, X_n)\,c_r(a_1, \dots, a_{n-1}) \ne 0. \qquad (**)$$

Then $f(a_1, \dots, a_{n-1}, X_n)$ is a polynomial in $X_n$ whose coefficient of $X_n^r$ is nonzero. Since $K$ is algebraically closed, this polynomial in $X_n$ has a root, say $a_n$. Since $f(a_1, \dots, a_n) = 0$, the hypothesis shows that $g(a_1, \dots, a_{n-1}, a_n) = 0$, and $(**)$ allows us to conclude that $g = g(X_1, \dots, X_n)$ depends nontrivially on $X_n$. This proves $(*)$.

To complete the proof, we show that $c_r R$ is 0 at every point $(b_1, \dots, b_{n-1})$. Since $K$ is infinite, it will follow that the polynomial $c_r R$ is 0; thus $R = 0$ because $c_r$ is not the 0 polynomial. Then $f$ and $g$ will have a nontrivial common factor by Proposition 8.1, and $f$ will have to divide $g$ because $f$ is prime. Thus suppose that $c_r(b_1, \dots, b_{n-1}) \ne 0$. Then $f(b_1, \dots, b_{n-1}, X_n)$ is a nonconstant polynomial in $X_n$ and must have a root $b_n$, since $K$ is algebraically closed. Hence $f(b_1, \dots, b_n) = 0$, and the hypothesis on $g$ shows that $g(b_1, \dots, b_n) = 0$. By $(*)$, $R(b_1, \dots, b_{n-1}) = 0$. This completes the proof.


Te YS OXY OY aA SY = IY Lig OY =i: 
8. The resultant matrix in the $W$ variable is

$$\begin{pmatrix} XY^4 - Y^5 & -2X^2Y^2 & X^3 & 0 \\ 0 & XY^4 - Y^5 & -2X^2Y^2 & X^3 \\ Y^4 & Y^3 & -X^2 & 0 \\ 0 & Y^4 & Y^3 & -X^2 \end{pmatrix},$$

and its determinant is $-X^3Y^9(Y - 2X)^2$. Substituting into either of the equations $F = 0$ and $G = 0$ gives the projective solutions $(x, y, w)$ equal to $(1, 0, 0)$, $(0, 0, 1)$, and $(1, 2, 4 \pm 4\sqrt 2)$, up to nonzero scalar factors. (One has to check that both the equations $F = 0$ and $G = 0$ are satisfied.)
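
The determinant can be spot-checked by evaluating the matrix at integer points. The following sketch compares an exact determinant with the stated closed form on a small grid; the helper names are illustrative, and agreement on a grid is evidence rather than a proof.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for k in range(c, size):
                a[r][k] -= factor * a[c][k]
    return result

def resultant_matrix(x, y):
    """The 4-by-4 matrix of the hint, evaluated at X = x, Y = y."""
    A, B, C = x * y**4 - y**5, -2 * x**2 * y**2, x**3
    a, b, c = y**4, y**3, -x**2
    return [[A, B, C, 0], [0, A, B, C], [a, b, c, 0], [0, a, b, c]]

def claimed(x, y):
    """Closed form -X^3 Y^9 (Y - 2X)^2 for the determinant."""
    return -x**3 * y**9 * (y - 2 * x)**2

grid = range(-3, 4)
print(all(det(resultant_matrix(x, y)) == claimed(x, y) for x in grid for y in grid))
```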


9. Introduce a new indeterminate $T = Y_i - Z_j$, and remove $Y_i$. Then $R(F, G) = R(Y_1, \dots, T + Z_j, \dots, Y_m, Z_1, \dots, Z_n)$ is a polynomial in $T$, the $Z$'s, and all the $Y$'s except for $Y_i$. Also, $R(F, G) = 0$ when $T$ is set equal to 0. Hence $R(F, G)$ is divisible by $T$. Then (a) and (b) follow. For (c), the polynomials $Y_i - Z_j$ are distinct primes. Since each divides $R(F, G)$, their product must divide. Their product has the same degree as $R(F, G)$, and the result follows.




10. We may assume that $K$ is algebraically closed and that $f$ is monic, say with $f(X) = \prod_{i=1}^m (X - \xi_i)$ and $f'(X) = m\prod_{j=1}^{m-1} (X - \eta_j)$. Then the previous problem gives $f'(\xi_i) = m\prod_{j=1}^{m-1} (\xi_i - \eta_j)$, and

$$R(f, f') = m^m c_{m,m-1} \prod_{i,j} (\xi_i - \eta_j) = c_{m,m-1} \prod_{i=1}^m f'(\xi_i)$$

with $c_{m,m-1}$ equal to the constant $c$ from Problem 9c when $n = m - 1$. According to Section V.4, the product is $(-1)^{m(m-1)/2}$ times the discriminant $D(f)$ of $f$. So the result follows.
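
The sign relation between $\prod_i f'(\xi_i)$ and the discriminant can be tested on monic polynomials with known integer roots. This sketch assumes the convention $D(f) = \prod_{i<j} (\xi_i - \xi_j)^2$; the helper names are hypothetical.

```python
from math import prod

def poly_from_roots(roots):
    """Ascending coefficients of the monic polynomial with the given roots."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (X - r).
        coeffs = [(coeffs[i - 1] if i > 0 else 0)
                  - r * (coeffs[i] if i < len(coeffs) else 0)
                  for i in range(len(coeffs) + 1)]
    return coeffs

def derivative(coeffs):
    """Ascending coefficients of the formal derivative."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    return sum(c * x**i for i, c in enumerate(coeffs))

def sign_relation_holds(roots):
    """Check prod f'(xi_i) == (-1)^(m(m-1)/2) * D(f)."""
    m = len(roots)
    f_prime = derivative(poly_from_roots(roots))
    lhs = prod(evaluate(f_prime, xi) for xi in roots)
    disc = prod((roots[i] - roots[j]) ** 2
                for i in range(m) for j in range(i + 1, m))
    return lhs == (-1) ** (m * (m - 1) // 2) * disc

print(sign_relation_holds([1, 2, 4]), sign_relation_holds([0, -1, 3, 5]))
```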

11. Replace $G$ by $G(X, Y, W) - (X^2 + Y^2)F(X, Y, W)$ to get $YWH(X, Y, W)$, where $H(X, Y, W) = (X^2 + Y^2)(X^2 - 3Y^2) - 4X^2YW$. Then

$$I(P, F \cap G) = I(P, F \cap YWH) = I(P, F \cap Y) + I(P, F \cap W) + I(P, F \cap H).$$

For $I(P, F \cap Y)$, we use the method of Section 4, looking at $F(t, 0, 1)$, which is $t^4$; thus $I(P, F \cap Y) = 4$. Since $P$ is not on $W$, $I(P, F \cap W) = 0$.

For $I(P, F \cap H)$, replace $H$ by $H(X, Y, W) - F(X, Y, W)$ to get $YJ(X, Y, W)$, where $J(X, Y, W) = -4X^2Y - 4Y^3 - 7X^2W + Y^2W$. Then

$$I(P, F \cap H) = I(P, F \cap YJ) = I(P, F \cap Y) + I(P, F \cap J),$$

and again $I(P, F \cap Y) = 4$. If the local expressions of $F$ and $J$ are denoted by $f$ and $j$, then their lowest-order terms $f_3(x, y)$ and $j_2(x, y)$ are given by

$$f_3(x, y) = 3x^2y - y^3 = y(\sqrt3\,x + y)(\sqrt3\,x - y),$$
$$j_2(x, y) = -7x^2 + y^2 = -(\sqrt7\,x + y)(\sqrt7\,x - y).$$

Thus $F$ and $J$ have no tangent lines in common at $P$, and $I(P, F \cap J) = 3 \cdot 2 = 6$. Collecting the results, we find that $I(P, F \cap G) = 4 + 4 + 6 = 14$.


12. Let $P = [x_0, y_0, w_0]$, and choose $\Phi \in GL(3, K)$ with $\Phi(x_0, y_0, w_0) = (0, 0, 1)$. The local versions of $G$ and $L$ are $g(X, Y) = G(\Phi^{-1}(X, Y, 1))$ and $l(X, Y) = L(\Phi^{-1}(X, Y, 1))$. The expansion of $g$ as a sum of homogeneous polynomials is $g = g_m + \cdots + g_d$ because $m = m_P(G) > 0$, and $l$ is of the form $l(X, Y) = aX + bY$ because $P$ lies on $L$. We can parametrize $l$ by $\varphi(t) = (bt, -at)$, and then the definition of intersection multiplicity is that $I(P, L \cap G)$ is the least integer $k$ such that the expression $g_k(\varphi(t)) = t^k g_k(b, -a)$ is nonzero. The definition of tangent line is any projective line $L_i$ whose local version $l_i$ is one of the factors of $g_m(X, Y) = c\prod_i (\alpha_i X + \beta_i Y)^{m_i}$. Then $g_m(\varphi(t)) = t^m g_m(b, -a) = c\,t^m \prod_i (\alpha_i b - \beta_i a)^{m_i}$. If $(a, b)$ is a multiple of some $(\alpha_i, \beta_i)$, then $g_m(\varphi(t)) = 0$; hence $I(P, L \cap G) \ge m + 1$. Otherwise $g_m(\varphi(t)) \ne 0$, and $I(P, L \cap G) = m$.




13. The linear span $\mathrm{LT}(I)$ of the members $\mathrm{LT}(f)$ for $f$ in $I$ is a monomial ideal and is of the form $(M_1, \dots, M_k)$ for suitable monomials $M_j$ each of the form $\mathrm{LM}(f_j)$ for some $f_j$ in $I$. Then $\{f_1, \dots, f_k\}$ is a subset of $I$ such that $(\mathrm{LT}(f_1), \dots, \mathrm{LT}(f_k)) = \mathrm{LT}(I)$, and $\{f_1, \dots, f_k\}$ is a Gröbner basis of $I$ by definition.

14. If $\alpha, \beta, \gamma$ are vectors of exponents in monomials such that the first $i$ with $w^{(i)} \cdot \alpha \ne w^{(i)} \cdot \beta$ has $w^{(i)} \cdot \alpha > w^{(i)} \cdot \beta$, then it is equally true that the first $i$ with $w^{(i)} \cdot (\alpha + \gamma) \ne w^{(i)} \cdot (\beta + \gamma)$ has $w^{(i)} \cdot (\alpha + \gamma) > w^{(i)} \cdot (\beta + \gamma)$. This proves that property (i) of monomial orderings holds with no further conditions on the weights. Property (ii) says for each vector $\alpha$ of nonnegative exponents not all 0 that the first $i$ with $w^{(i)} \cdot \alpha \ne 0$ has $w^{(i)} \cdot \alpha > 0$. Applying this condition as a necessary condition to the $j^{\text{th}}$ standard basis vector $\alpha = e_j$, we see that the first $i$ such that $w^{(i)}_j \ne 0$ must have $w^{(i)}_j > 0$ for (ii) to hold. On the other hand, if this condition holds for all $j$, then a suitable positive linear combination of these conditions gives (ii) for any $\alpha$.
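
The comparison rule and the positivity condition can be written out directly. In the sketch below a list of weight vectors stands for $w^{(1)}, w^{(2)}, \dots$; the function names, the sample weight systems, and the treatment of a coordinate on which every weight vanishes as a failure are choices made for the illustration.

```python
def compare(weights, alpha, beta):
    """Return 1, -1, or 0 by the first weight vector w with w.alpha != w.beta."""
    for w in weights:
        da = sum(wi * ai for wi, ai in zip(w, alpha))
        db = sum(wi * bi for wi, bi in zip(w, beta))
        if da != db:
            return 1 if da > db else -1
    return 0

def satisfies_positivity(weights, nvars):
    """For each j, the first i with w^(i)_j != 0 must have w^(i)_j > 0.

    A coordinate on which every weight vector vanishes is treated as a
    failure here, since the weights then cannot separate e_j from 0.
    """
    for j in range(nvars):
        first = next((w[j] for w in weights if w[j] != 0), None)
        if first is None or first < 0:
            return False
    return True

lex = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
grlex = [(1, 1, 1), (1, 0, 0), (0, 1, 0)]
alpha, beta, gamma = (2, 0, 0), (1, 1, 0), (0, 3, 1)
shifted_a = tuple(a + c for a, c in zip(alpha, gamma))
shifted_b = tuple(b + c for b, c in zip(beta, gamma))
print(compare(grlex, alpha, beta), compare(grlex, shifted_a, shifted_b))
print(satisfies_positivity(lex, 3), satisfies_positivity([(1, -1)], 2))
```

Adding $\gamma$ to both exponent vectors changes each dot product by the same amount, so the outcome of the comparison is unchanged, as in property (i).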

15. In (a), $a > a'$ implies that $X^{a-a'} \ge X > Y^{b'}$ for all $b' \ge 0$. Multiplying by $X^{a'}$ gives $X^a > X^{a'}Y^{b'}$. Since $Y^b \ge 1$ implies $X^aY^b \ge X^a$, we conclude that $X^aY^b > X^{a'}Y^{b'}$ for all $b$ and $b'$. For $a = a'$, we observe that $b > b'$ implies that $Y^{b-b'} > 1$ and hence that $Y^b > Y^{b'}$. Multiplying by $X^a$ gives $X^aY^b > X^aY^{b'}$. Hence the ordering is lexicographic.

In (b), we observe that an inequality between $X^a$ and $Y^b$ implies the same inequality between $X^{na}$ and $Y^{nb}$. Consequently the particular inequality for $X^a$ and $Y^b$ depends only on the rational number $a/b$. The assumption for (b) is that $X < Y^q$, hence that $X^a < Y^{qa} \le Y^b$ if $qa \le b$, thus if $a/b \le q^{-1}$. Thus the set $S$ of rationals $a/b$ such that $X^a > Y^b$ is bounded below by $q^{-1}$. Let $r^{-1}$ be the greatest lower bound of $S$. We know then that $q^{-1} \le r^{-1}$, hence that $r \le q$. So $0 < r < \infty$, and $r$ is a well-defined real number.

Suppose that $u/v < r^{-1}$. Then $u/v$ is not in $S$, and so $X^u < Y^v$. In the reverse direction, suppose that $u/v > r^{-1}$. Then there is some rational $c/d$ in $S$ with $u/v > c/d \ge r^{-1}$; this has $X^c > Y^d$. Then $X^{ud} > X^{vc} > Y^{vd}$. Since $d > 0$, $X^u < Y^v$ would imply $X^{ud} < Y^{vd}$, which is false. Thus we must have $X^u > Y^v$. This proves (b).

For (c), the only rational $u/v$ for which the inequality between $X^u$ and $Y^v$ is not decided is $u/v = r^{-1}$, and that only if $r$ is rational. In this case a single weight vector will decide the correct inequality. All other inequalities between monomials follow from these. In fact, what needs deciding is the inequality between $X^aY^b$ and $X^{a'}Y^{b'}$ when $a > a'$ and $b < b'$, and this is the same as the inequality between $X^{a-a'}$ and $Y^{b'-b}$.

16. The formulas for $f$ are a matter of computation. Both satisfy the conditions of Proposition 8.20 because $\mathrm{LM}(f) = X^2Y$ is $\ge$ each of $\mathrm{LM}((X + Y)f_1) = X^2Y$, $\mathrm{LM}(1 \cdot f_2) = Y^2$, $\mathrm{LM}(Xf_1) = X^2Y$, and $\mathrm{LM}((X + 1)f_2) = XY^2$ and because no term of $r_1$ or $r_2$ is divisible by $\mathrm{LM}(f_1) = XY$ or $\mathrm{LM}(f_2) = Y^2$.

17. In (a), we check that $\{X^2 + cXY,\ XY\}$ is a Gröbner basis using Theorem 8.23. The leading monomials of the two generators are $X^2$ and $XY$, and neither divides the other. Since the leading coefficients are 1, this Gröbner basis is minimal.

In (b) when $c \ne 0$, $X^2 + cXY$ has a nonzero term whose monomial is divisible by the leading monomial of another generator; specifically the term $cXY$ in $X^2 + cXY$ is divisible by the $XY$ from the other generator. Following the procedure in Theorem 8.28, we find that $\{X^2,\ XY\}$ is the reduced Gröbner basis.


18. If $(c_1, \dots, c_n)$ lies in $V_K(I)$, then $c_j$ is one of finitely many roots of $P_j(X)$, for each $j$. Hence $|V_K(I)| \le \prod_{j=1}^n \deg P_j$.


19. Fix $j$, and choose a polynomial $Q_j$ in $X$ that vanishes at the $j^{\text{th}}$ coordinate of every member of $V_K(I)$. Then $P_j(X_1, \dots, X_n) = Q_j(X_j)$ is a polynomial vanishing on $V_K(I)$, and the Nullstellensatz shows that some power of it is in $I$. The result is a polynomial in $X_j$ alone, as required.

20. If $V_K(I)$ is a finite set, then Problem 19 shows that $I$ contains a nonconstant polynomial in $X_j$ for each $j$. The leading monomial for the $j^{\text{th}}$ such polynomial has to be a power of $X_j$, and it lies in $\mathrm{LT}(I)$. Conversely suppose that a power $X_j^{m_j}$ lies in $\mathrm{LT}(I)$ for each $j$. Form a reduced Gröbner basis of $I$. Since the only monomials dividing $X_j^{m_j}$ are powers of $X_j$, there exist members $g_j$ of the Gröbner basis for $1 \le j \le n$ such that

$$g_j(X_1, \dots, X_n) = X_j^{m_j} + X_j^{m_j - 1}a_{j,m_j-1} + \cdots + X_j a_{j,1} + a_{j,0}$$

for suitable polynomials $a_{j,m_j-1}, \dots, a_{j,0}$ in $X_{j+1}, \dots, X_n$. Then $V_K(I)$ is contained in $V_K((g_1, \dots, g_n))$, and any member $(c_1, \dots, c_n)$ of the latter has the property for each $j$ that $c_j$ is a root of a polynomial of degree $m_j$ in one variable, once $(c_{j+1}, \dots, c_n)$ is fixed. Thus $V_K(I)$ is contained in a finite set and has to be finite.


21. For (a), the coefficients $a_{i_1, \dots, i_n}$ are given as in $K(X)$, and we look for solutions of $F(T_1, \dots, T_n) = 0$. Clearing fractions in the coefficients, we see that it is enough to find a solution when each $a_{i_1, \dots, i_n}$ has denominator 1.

For (b), substitution of $T_i = \sum_{j=0}^N b_{ij}X^j$, where each $b_{ij}$ is an unknown in $K$, into the equation $F(T_1, \dots, T_n) = 0$ gives

$$\sum_{i_1, \dots, i_n} a_{i_1, \dots, i_n} \Big(\sum_{j=0}^N b_{1j}X^j\Big)^{i_1} \cdots \Big(\sum_{j=0}^N b_{nj}X^j\Big)^{i_n} = 0.$$

We expand this out and set the coefficient of each power of $X$ equal to 0. The largest possible power of $X$ that can appear is the sum of the largest power of $X$ in any $a_{i_1, \dots, i_n}$, namely $\delta$, and $\sum_{k=1}^n N i_k$. Since $F$ is homogeneous of degree $d$, $\sum_{k=1}^n i_k = d$. Thus the largest possible power of $X$ is $Nd + \delta$. We get one equation for each power of $X$ that appears, and the unknowns are the various $b_{ij}$'s.

22. The number of equations is $\le Nd + \delta + 1$, since the powers of $X$ go from 0 to at most $Nd + \delta$. The number of unknowns is one for each index $i$ with $1 \le i \le n$ and each possible power of $X$ from 0 to $N$, hence exactly $(N + 1)n$. For $N$ sufficiently large we want to see that $Nd + \delta + 1 < (N + 1)n$. Since $d < n$, the inequality in question is $\delta + 1 - n < N(n - d)$, and this is satisfied by taking $N$ large enough.
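
The count can be made concrete by searching for the least admissible $N$. This is a small sketch under the hypothesis $d < n$; the function name `smallest_N` and the sample values are illustrative.

```python
def smallest_N(n, d, delta):
    """Least N >= 0 with N*d + delta + 1 < (N + 1)*n, assuming d < n."""
    if d >= n:
        raise ValueError("the count requires d < n")
    N = 0
    # Compare the bound on the number of equations with the number of unknowns.
    while N * d + delta + 1 >= (N + 1) * n:
        N += 1
    return N

print(smallest_N(3, 2, 5))
```

With $n = 3$, $d = 2$, and $\delta = 5$, the condition $\delta + 1 - n < N(n - d)$ reads $3 < N$, which matches the value returned.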




23. In the context of Problem 22, we have a homogeneous system with more unknowns than equations (for large $N$). If the number of unknowns is $n + 1$ and the number of equations is $m$, then we are looking for solutions in $\mathbb P^n_K$. Since the inequality $m \le n$ is satisfied, the quoted theorem applies and produces a nonzero solution for the $b_{ij}$'s.


Chapter IX 


1. For (a), we argue by contradiction. Suppose that $c_1(x), \dots, c_n(x)$ are members of $k(x)$, not all 0, such that $\sum_j c_j(x)t_j = 0$. Clearing fractions, we may assume that each $c_j(x)$ lies in $k[x]$. If necessary, we can divide through by a power of $x$ and arrange that some $c_j(x)$, say $c_{j_0}(x)$, has a nonzero constant term. The element $x$ is by assumption transcendental over $k$. Applying the substitution homomorphism of $k[x]$ into $k$ given by evaluation at 0 yields $\sum_j c_j(0)t_j = 0$. By the assumed linear independence of $t_1, \dots, t_n$ over $k$, $c_j(0) = 0$ for all $j$. This contradicts the fact that $c_{j_0}(0) \ne 0$. Then (b) is immediate. For (c), we know that $[F : k(x)] < \infty$, and therefore $[k'(x) : k(x)] < \infty$. By (b), $[k' : k] < \infty$.


2. This is immediate from Proposition 7.15. Alternatively, here is a direct proof. We may assume that the characteristic is $p$. It is enough to prove that if $K$ is perfect and $L$ is a finite extension, then $L$ is perfect. Arguing by contradiction, we may assume that $[L : K]$ is as small as possible among all counterexamples. The image $M$ of $L$ under $x \mapsto x^p$ is a subfield of $L$, and $M$ contains $K$ because $K$ is perfect. We cannot have $M = L$, since $L$ is assumed not to be perfect. By construction of $L$, $M$ is perfect. Composing $x \mapsto x^p$ from $L$ into $M$ with $x \mapsto x^{1/p}$ from $M$ into itself, we obtain a field map of $L$ onto $M$ that fixes $M$. The result is a one-one $M$ linear transformation of the finite-dimensional $M$ vector space $L$ onto a proper vector subspace, contradiction.


3. Let $F$ be a function field in one variable over $k$. Since $k$ is perfect, Theorem 7.20 shows that $F$ is separably generated. Let us write $F = k(x_1, \dots, x_n)$. Theorem 7.18 shows that there is some $x_j$ such that $F$ is a separable extension of $k(x_j)$. If we write $x$ for $x_j$, then the Theorem of the Primitive Element shows that $F = k(x)[y]$ for some $y$ algebraic over $k(x)$. Put $R = k[x][y] = k[x, y]$; the field of fractions of $R$ is $F$. Let $g(x, Y)$ be the minimal polynomial of $y$ over $k(x)$. If $d(x)$ is a common denominator for the coefficients of $g(x, Y)$, then $d(x) \ne 0$ because $x$ is transcendental over $k$. If we set $f(X, Y) = d(X)g(X, Y)$, then $f(x, y) = 0$. Hence the substitution homomorphism $k[X, Y] \to R$ given by replacing $X$ by $x$ and $Y$ by $y$ factors through to a homomorphism $\varphi$ carrying $k[X, Y]/(f(X, Y))$ onto $R$. The ring $R$ is an integral domain; hence the ideal $(f(X, Y))$ is prime, and $f(X, Y)$ is irreducible. We can find an ideal $I$ in $k[X, Y]$ containing $(f(X, Y))$ such that $\varphi$ descends to an isomorphism of $k[X, Y]/I$ onto $R$. This ideal $I$ has to be prime, and we let $J$ be a maximal ideal of $k[X, Y]$ containing it. Then we have a chain of inclusions of prime ideals

$$0 \subsetneq (f(X, Y)) \subseteq I \subseteq J.$$

Theorem 7.22 shows that $k[X, Y]$ has Krull dimension 2, and it follows that either $(f(X, Y)) = I$ in the above chain of inclusions, or $I = J$. The latter equality would mean that $I$ is maximal and therefore that $R = k[X, Y]/I$ is a field; this is not the case, and thus $(f(X, Y)) = I$. Hence $R = k[X, Y]/(f(X, Y))$. Here $f(X, Y)$ is an affine plane curve irreducible over $k$, and the field of fractions of $R$ is by definition the function field of the curve; this field is $F$, and the argument is complete.

4. The singular points are common zeros of $f$, $\frac{\partial f}{\partial X}$, and $\frac{\partial f}{\partial Y}$. If there are infinitely many, then Bezout's Theorem says that $f$ and $\frac{\partial f}{\partial X}$ have a nontrivial common factor, and so do $f$ and $\frac{\partial f}{\partial Y}$. Since $f$ is irreducible and the partial derivatives reduce degrees in one or the other variable, we must have $\frac{\partial f}{\partial X} = \frac{\partial f}{\partial Y} = 0$ as polynomials. This is impossible in characteristic 0. In characteristic $p$, the first condition says that the only powers of $X$ that appear in $f$ are powers of $X^p$, and the second condition says that the only powers of $Y$ that appear are powers of $Y^p$. The coefficients of $f$ are $p^{\text{th}}$ powers because $k$ is assumed perfect, and thus $f$ is exhibited as a $p^{\text{th}}$ power, in contradiction to its assumed irreducibility.

5. Differentiate $f(X, b) = (X - a)f_1(X)$ and evaluate at $(a, b)$ to obtain $\frac{\partial f}{\partial X}(a, b) = f_1(a) + (a - a)f_1'(a) = f_1(a)$.

6. Multiply the equation $g(X, b) = (X - a)g_1(X)$ by $f_1(X)$ and substitute to obtain $g(X, b)f_1(X) = f(X, b)g_1(X)$. Then the function $g(X, \cdot)f_1(X) - f(X, \cdot)g_1(X)$ is 0 at $b$ and is of the form $g(X, Y)f_1(X) - f(X, Y)g_1(X) = (Y - b)h_1(X, Y)$, where $h_1(X, Y)$ for each $X$ is a polynomial in $Y$. Since $(Y - b)h_1(X, Y)$ is equal to a polynomial in $(X, Y)$, $h_1(X, Y)$ is a polynomial in $(X, Y)$. To complete the problem, evaluate both sides at $(x, y)$, and use the facts that $f(x, y) = 0$ and that $f_1(x) \ne 0$.

7. Since $F = k(x, y)$ is a function field in one variable, it is enough to see that $y$ is transcendental over $k$. Arguing by contradiction, suppose that there is some nonzero polynomial $c(Y)$ in $k[Y]$ having $y$ as a root. As a polynomial in $k[X, Y]$, $c(Y)$ maps to $c(y) = 0$ when we pass to the quotient in $k[X, Y]/(f(X, Y))$, and therefore $c(Y)$ is the product of $f(X, Y)$ by a polynomial. On the other hand, $\frac{\partial f}{\partial X}$ is not 0, and thus $f(X, Y)$ depends nontrivially on $X$. Hence the product of $f(X, Y)$ and any nonzero polynomial in $(X, Y)$ depends nontrivially on $X$, contradiction. The result now follows from the observation at the end of Section 1.


8. Substituting $a$ for $x$ in the formula for $g(x, y)$ gives

$$g(a, y) = (y - b)^k h_k(a, y)/f_1(a)^k.$$

In this formula, $h_k(a, y)$ is a polynomial expression in $y$, hence also in $y - b$. Thus $v_1$ is $\ge 0$ on it. The expression $f_1(a)^k$ is a nonzero member of $k$, on which $v_1$ takes the value 0. Therefore

$$v_1(g(a, y)) = k\,v_1(y - b) + v_1(h_k(a, y)) \ge k\,v_1(y - b).$$

The left side is independent of $k$, and the right side is unbounded in $k$. Therefore there is some upper bound to the values of $k$ for which $g(x, y)$ has an expansion of the kind in question.


9. For (a), we cannot have $h_k(a, b) = 0$ in Problem 8 for arbitrarily large $k$ because of the bound found in Problem 8. If $k = n$ is the smallest $k$ for which $h_k(a, b) \ne 0$, then the displayed formula holds with $h = h_n$. For uniqueness we substitute $a$ for $x$ and see that $g(a, y) = p_n(y)(y - b)^n$ for a polynomial $p_n$ with $p_n(b) \ne 0$. We cannot have two such expressions involving distinct powers $n$ because $y$ is transcendental over $k$.

For (b), we see from (a) that every nonzero member of $R$ is of the required form with $n \ge 0$. Since $F$ is the field of fractions of $R$, the same thing is true for $F$ as long as we allow $n$ to be arbitrary in $\mathbb Z$.

For (c), if we have two such expressions, we set them equal, clear fractions, and write the result as $(y - b)^k p(x, y) = q(x, y)$ for some $k \ge 0$ and for some polynomials $p$ and $q$ with $p(a, b) \ne 0$ and $q(a, b) \ne 0$. Substituting $(a, b)$ for $(x, y)$, we obtain 0 from $(y - b)^k p(x, y)$ unless $k = 0$, and we obtain something nonzero from $q(x, y)$. Therefore $k = 0$, and the required uniqueness follows.


10. From the definition we immediately have $v(g) = +\infty$ if and only if $g = 0$, as well as $v(gg') = v(g) + v(g')$ for all $g$ and $g'$. We are to show that $v(g + g') \ge \min(v(g), v(g'))$. Thus write $g(x, y) = (y - b)^n h_1(x, y)/h_2(x, y)$ and $g'(x, y) = (y - b)^m h_1'(x, y)/h_2'(x, y)$ with $n \le m$. Then $\min(v(g), v(g')) = \min(n, m) = n$. Also,

$$g + g' = (y - b)^n\,\frac{h_1h_2' + (y - b)^{m-n}h_1'h_2}{h_2h_2'}.$$

The numerator of the displayed fraction is a polynomial and can be written in the form of Problem 9a. Say that $(y - b)^k$ is the power of $(y - b)$ that appears in it, $k$ being $\ge 0$. Then $v(g + g') = n + k$, and this is $\ge n = \min(v(g), v(g'))$. The assertions about the valuation ring and the valuation ideal are clear.

11. Let $v'$ be a second valuation having the stated properties. If $g(x, y)$ is given in $F^\times$, decompose $g$ as in Problem 9b, and apply $v'$. Then we obtain $v'(g(x, y)) = n\,v'(y - b) + v'(h_1(x, y)) - v'(h_2(x, y))$. The assumptions on $v'$ show that $v'(h_1(x, y)) = v'(h_2(x, y)) = 0$. Therefore

$$v'(g(x, y)) = n\,v'(y - b) = v'(y - b)\,v(g(x, y)),$$

and $v' = v'(y - b)\,v$. By assumption, $v'(y - b)$ is positive. Since $v'$ has to be onto $\mathbb Z \cup \{\infty\}$, we must have $v'(y - b) = 1$.

12. For (a), the argument is the same as with Problem 7 except that the roles of $x$ and $y$ are reversed. The partial derivative $\frac{\partial}{\partial Y}(Y^2 - f(X)) = 2Y$ is not the 0 element because the characteristic is not 2, and hence that earlier argument applies. Part (b) is elementary field theory, and (d) is a routine verification.

For (c), let $k'$ be the subfield of elements of $F$ algebraic over $k$. Problem 1 shows that $[k' : k] \le [k'(x) : k(x)] \le [F : k(x)] = 2$. Arguing by contradiction, suppose that $\{1, t\}$ is a basis of $k'$ over $k$. Let $X^2 + uX + v$ be the minimal polynomial of $t$ over $k$; $t$ satisfies $t^2 + ut + v = 0$. Problem 1a shows that $t = a(x) + yb(x)$ with $b(x) \ne 0$, and then $t$ satisfies $t^2 - 2a(x)t + (a(x)^2 - f(x)b(x)^2) = 0$. Hence $ut + v = -2a(x)t + (a(x)^2 - f(x)b(x)^2)$. If $u \ne -2a(x)$, then we can solve for $t$ and obtain the contradiction that $t$ is in $k(x)$. Thus $u = -2a(x)$, and also $v = a(x)^2 - f(x)b(x)^2$. Since $x$ is transcendental over $k$, the first of these shows that $a(x)$ does not involve $x$, i.e., $a(x)$ lies in $k$. Then the second shows that $f(x)b(x)^2$ lies in $k$, and unique factorization leads to the conclusion that $f(x)$ and $b(x)$ do not depend on $x$. This contradicts the assumption that $f(X)$ is nonconstant.


13. Let $z = a(x) + yb(x)$ be in the integral closure. Then so is the image of $z$ under the nontrivial Galois group element $\sigma$, and so are $z + \sigma(z)$ and $z\sigma(z)$. The latter elements are $2a(x)$ and $a(x)^2 - f(x)b(x)^2$. Thus $a(x)$ is in the intersection of the integral closure with $k(x)$, which is $k[x]$ because $k[x]$ is a principal ideal domain and is integrally closed. Then $f(x)b(x)^2$ is in $k[x]$ by the same argument. Since $f(x)$ is square free, it follows that $b(x)$ is in $k[x]$.


14. Part (a) is immediate from Corollary 6.6. Discrete valuations of $F$ that are not in $D_F$ play no role because of the inclusion $k \subseteq R$: any discrete valuation that is $\ge 0$ on $R$ has to be 0 on $k^\times$, since the image of $k^\times$ under the valuation is a subgroup of $\mathbb Z$.

For (b), the condition for $z \ne 0$ to be in $L(p(x)_\infty)$ is that $v(z) \ge -p\,\mathrm{ord}_v(x)_\infty$ for all $v \in D_F$. If a particular $v$ has $v(x) \ge 0$, then $v$ does not contribute to $(x)_\infty$, and this condition says that $v(z) \ge 0$. By (a), $z$ is in $R$.


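15. For (a), let $c(x) = c_n x^n + \cdots + c_0 = x^n(c_n + c_{n-1}x^{-1} + \cdots + c_0 x^{-n})$ with
$c_n \ne 0$. Then $v(c_n) = 0$, and $v(c_j x^{j-n}) > 0$ for $j < n$. Hence

$$\begin{aligned}
v\big(x^n(c_n + c_{n-1}x^{-1} + \cdots + c_0 x^{-n})\big) &= nv(x) + v(c_n + c_{n-1}x^{-1} + \cdots + c_0 x^{-n}) \\
&= nv(x) + v(c_n) = nv(x).
\end{aligned}$$

For (b), $2v(y) = v(y^2) = v(f(x)) = (\deg f)v(x)$, the latter equality holding by (a).
In (c), we have

$$\begin{aligned}
v(a(x) + yb(x)) &\ge \min\big(v(a(x)),\, v(yb(x))\big) \\
&= \min\big(v(a(x)),\, v(y) + v(b(x))\big) \\
&= \min\big((\deg a)v(x),\, (\tfrac12 \deg f + \deg b)v(x)\big) \\
&= v(x)\max\big(\deg a,\, \tfrac12 \deg f + \deg b\big) \ge pv(x).
\end{aligned}$$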


16. Any $v \in \mathbb{D}_F$ with $v(x) \ge 0$ has $v(z) \ge 0 = -\mathrm{ord}_v (x)_\infty$ on all elements
$z = a(x) + yb(x)$ with $a(x)$ and $b(x)$ in $k[x]$, by Problems 13 and 14a. Suppose
that $v(x) < 0$. Then Problem 15c and the assumptions on the degrees of $a(x)$ and
$b(x)$ show that $v(z) \ge pv(x) = -p\,\mathrm{ord}_v (x)_\infty$. Hence $(z) \ge -p(x)_\infty$, and $z$ lies in
$L(p(x)_\infty)$.




18. For (a), let $\sigma$ be the nontrivial element of the Galois group. Problem 17c
shows that if $z = a(x) + yb(x)$ is in $L(p(x)_\infty)$, then so is $\sigma(z) = a(x) - yb(x)$.
Hence any $v \in \mathbb{D}_F$ with $v(x) < 0$ has $v(a(x) + yb(x)) \ge -p\,\mathrm{ord}_v (x)_\infty = pv(x)$
and $v(a(x) - yb(x)) \ge -p\,\mathrm{ord}_v (x)_\infty = pv(x)$. Consequently

$$\begin{aligned}
v(a(x)) = v(2a(x)) &\ge \min\big(v(a(x) + yb(x)),\, v(a(x) - yb(x))\big) \\
&\ge \min\big(pv(x),\, pv(x)\big) = pv(x)
\end{aligned}$$

and

$$v\big(a(x)^2 - f(x)b(x)^2\big) = v(a(x) + yb(x)) + v(a(x) - yb(x)) \ge pv(x) + pv(x).$$

Using Problem 15a and the fact that $v(x) < 0$, we see from these two inequalities
that $\deg a \le p$ and $\deg(a^2 - fb^2) \le 2p$.

For (b), Problem 14b shows that $L(p(x)_\infty) \subseteq R$, and Problem 13 shows that
$R$ consists of all $a(x) + yb(x)$ with $a(x)$ and $b(x)$ in $k[x]$. Part (a) thus shows that
$\deg a \le p$ and $\deg(a^2 - fb^2) \le 2p$. Since $\deg a \le p$, the second of these inequalities
shows that $\deg fb^2 \le 2p$. Thus $\deg b + \tfrac12 \deg f \le p$. In the reverse direction, if
$a(x)$ and $b(x)$ are polynomials satisfying the degree relations, then Problem 16 shows
that $a(x) + yb(x)$ is in $L(p(x)_\infty)$.

19. The polynomials $a(x)$ and $b(x)$ are limited only by the restrictions on their
degrees. From $\deg a \le p$, we get a space of dimension $p + 1$. From $\deg b + \tfrac12 \deg f \le p$,
we have $\deg b \le [p - \tfrac12 \deg f]$, and we get a space of dimension $[p - \tfrac12 \deg f] + 1$
if $[p - \tfrac12 \deg f] \ge 0$. Thus

$$\begin{aligned}
\ell(p(x)_\infty) &= (p + 1) + [p - \tfrac12 \deg f] + 1 \\
&= 2p + 2 + [-\tfrac12 \deg f] = 2p + 2 - [\tfrac12(1 + \deg f)]
\end{aligned}$$

if $p \ge -[-\tfrac12 \deg f] = +[\tfrac12(1 + \deg f)]$.

20. Part (a) is immediate from Theorem 9.3, since $[F : k(x)] = 2$. For (b), Theorem 9.9
and Problem 19, in combination with the result of (a), show for sufficiently
large positive $p$ that

$$1 - g = \ell(p(x)_\infty) - p\deg (x)_\infty = 2p + 2 - [\tfrac12(1 + \deg f)] - 2p.$$

Hence $g = [\tfrac12(1 + \deg f)] - 1$.
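The count in Problems 19 and 20 can be checked mechanically. The following is a minimal Python sketch, not part of the original hints: the function names `ell`, `genus`, and `genus_from_riemann_roch` are hypothetical, and the sketch assumes only the degree bounds of Problem 18b together with $\deg (x)_\infty = 2$.

```python
def ell(p: int, deg_f: int) -> int:
    """Dimension of L(p(x)_inf), counted from deg a <= p and deg b + deg_f/2 <= p."""
    dim_a = p + 1                          # monomials 1, x, ..., x^p for a(x)
    max_deg_b = (2 * p - deg_f) // 2       # greatest integer in p - deg_f/2
    dim_b = max_deg_b + 1 if max_deg_b >= 0 else 0
    return dim_a + dim_b


def genus(deg_f: int) -> int:
    """Closed formula g = [(1 + deg f)/2] - 1 from Problem 20b."""
    return (1 + deg_f) // 2 - 1


def genus_from_riemann_roch(deg_f: int, p: int) -> int:
    """Solve 1 - g = ell(p(x)_inf) - 2p for g; valid for sufficiently large p."""
    return 1 - (ell(p, deg_f) - 2 * p)


for deg_f in range(1, 9):
    p = deg_f + 3                          # comfortably large
    print(deg_f, ell(p, deg_f), genus_from_riemann_roch(deg_f, p), genus(deg_f))
```

In each printed row the last two columns should agree; for instance $\deg f = 3$ gives genus 1 and $\deg f = 1$ gives genus 0, matching Problem 22a.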

21. Let $\Phi : k(X)[Y] \to k(X)[Z]$ be the substitution homomorphism that fixes
$k(X)$ and has $\Phi(Y) = g(X)Z$, and follow it with the quotient homomorphism to
$k(X)[Z]/(Z^2 - h(X))$. Then

$$\Phi(Y^2 - f(X)) = g(X)^2 Z^2 - f(X) = g(X)^2 (Z^2 - h(X)),$$

which goes to 0 in the quotient. Thus the composition of $\Phi$ followed by the quotient
map descends to a field map $\varphi : k(X)[Y]/(Y^2 - f(X)) \to k(X)[Z]/(Z^2 - h(X))$.
The inverse is constructed in the same way, starting from the formula $\Psi(Z) = g(X)^{-1}Y$.




22. For (a), the conclusion genus 1 when there are no repeated roots is immediate
from Problem 20b with $\deg f = 3$. If there are repeated roots, then we can write
$f(X) = g(X)^2 h(X)$ with $\deg g = \deg h = 1$. Applying Problem 21, we see that the
genus is the same as for Problem 20b with $\deg f = 1$, i.e., the genus is 0.

For (b), a singularity occurs only at points $(x, y)$ of the zero locus in $k_{\mathrm{alg}}^2$ at which
both first partials are 0. Then $2Y = 0$, which says that $y = 0$ because the characteristic
is not 2, and $f'(X) = 0$, which says that $x$ is a root in $k_{\mathrm{alg}}$ of both $f(X)$ and $f'(X)$.
This means that $x$ is at least a double root in $k_{\mathrm{alg}}$ of $f(X)$.


23. The residue class degree $f_v$ is 1, since $k$ is algebraically closed. Thus $\deg nv = n$.
Corollary 9.4 gives $\ell(0v) = 1$, Corollaries 9.22 and 9.23 together give $\ell(1v) = 1$
if $g \ge 1$, and Corollary 9.19 gives $\ell((2g - 1)v) = \deg((2g - 1)v) + (1 - g) =
(2g - 1) + (1 - g) = g$ and $\ell(2gv) = \deg(2gv) + (1 - g) = g + 1$. The inequality
$\ell(nv) \le \ell((n + 1)v) \le \ell(nv) + 1$ follows by combining Theorem 9.6, the fact that
$A \le B$ implies $L(A) \subseteq L(B)$, and the fact that $f_v = 1$.


24. For each $n \ge 0$,

$$L(nv) = \{0\} \cup \{x \in F^\times \mid -(x)_\infty \ge -nv\} = \{0\} \cup \{x \in F^\times \mid (x)_\infty \le nv\}.$$

Thus $n \ge 1$ is a gap if and only if $\ell(nv) = \ell((n - 1)v)$, and otherwise $\ell(nv) =
\ell((n - 1)v) + 1$ by the last fact in Problem 23.

Suppose that there are $m$ gaps in passing from $\ell(0v)$ to $\ell(2gv)$. In the process we
take $2g$ steps from $(n - 1)v$ to $nv$, of which $m$ are gaps and $2g - m$ are nongaps. (The
gaps are certain of these integers $n$, $1 \le n \le 2g$.) Since $\ell(0v) = 1$ and $\ell(2gv) = g + 1$
by Problem 23, the total number of nongaps is $(g + 1) - 1 = g$. Solving $2g - m = g$
gives $m = g$. The formulas $\ell((2g - 1)v) = g$ and $\ell(2gv) = g + 1$ from Problem 23
show that $2g$ is not a gap.


25. For (a), if the gap sequence is $(1, 2, \dots, g)$, then $1 = \ell(0v) = \ell(1v) =
\ell(2v) = \cdots = \ell(gv)$. Conversely if the gap sequence is something else, let $n$ with
$1 \le n \le g$ be the first nongap; then $1 = \ell(0v) = \cdots = \ell((n - 1)v) < \ell(nv) \le \ell(gv)$.

For (b), Problem 23 gives $\ell(0v) = \ell(1v) = 1$ if $g \ge 1$, and thus 1 is a gap.

For (c), there are no integers strictly between 0 and $2g$ if $g = 0$, and the only such
integer for $g = 1$ is 1. Part (b) shows that the gap sequence is indeed $(1)$ if $g = 1$,
and thus the gap sequence is always the standard one.

For (d), we have some $x$ and $y$ in $F^\times$ with $(x)_\infty = rv$ and $(y)_\infty = sv$. Thus
$(x) = (x)_0 - rv$ and $(y) = (y)_0 - sv$, and $(xy) = (x)_0 + (y)_0 - (r + s)v$. Since $v$
does not contribute to $(x)_0$ and $(y)_0$, $(xy)_\infty = (r + s)v$, and thus $r + s$ is a nongap.

For (e), if 2 is a nongap, then iteration of (d) shows that $2, 4, 6, \dots, 2g - 2$ are
nongaps. The only possible gaps are the remaining integers from 1 to $2g - 1$, namely
$1, 3, 5, \dots, 2g - 1$. There are $g$ of these, and so all of them must be gaps.
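The closure property in (d) and the count in (e) can be illustrated with a short Python sketch. It is an illustration under stated assumptions and not part of the original hints: the function name `gap_sequence` is hypothetical, and the generating sets passed to it are hypothetical inputs standing for nongaps already known, with every sum of nongaps treated as a nongap as in (d).

```python
def gap_sequence(generators, genus):
    """Integers n with 1 <= n <= 2*genus that are not sums of the given nongaps."""
    bound = 2 * genus
    nongaps = {0}
    for n in range(1, bound + 1):
        # n is a nongap if removing one generator leaves a known nongap (or 0)
        if any((n - s) in nongaps for s in generators if s <= n):
            nongaps.add(n)
    return [n for n in range(1, bound + 1) if n not in nongaps]


# Part (e): if 2 is a nongap and g = 4, the gaps are forced to be 1, 3, 5, 7.
print(gap_sequence([2], 4))

# A second hypothetical input: nongaps generated by 3 and 4, with g = 3.
print(gap_sequence([3, 4], 3))
```

In both runs the number of gaps equals the genus supplied, consistent with the count $m = g$ obtained in Problem 24.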


Chapter X 


1. If $F$ is in $I(P)$, expand $F$ as a sum of homogeneous terms $F = \sum_{d \ge 0} F_d$.
Then $0 = F(tx_0, tx_1, \dots, tx_n) = \sum_d F_d(tx_0, \dots, tx_n) = \sum_d t^d F_d(x_0, \dots, x_n)$ for
all $t \in k^\times$. Since $k$ is infinite, every coefficient of this polynomial in $t$ is 0. Thus
each $F_d$ is in $I(P)$, and $I(P)$ is generated by homogeneous elements.
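The identity $F(tx) = \sum_d t^d F_d(x)$ used above can be tried on a small example. The following Python sketch is only an illustration: the dictionary representation of polynomials, the helper names, and the sample polynomial $F = (X_0 - X_1) + (X_0^2 - X_1^2)$ with the point $P = [1, 1]$ are all assumptions introduced here.

```python
from collections import defaultdict


def homogeneous_components(poly):
    """Split {exponent tuple: coefficient} by total degree d into the pieces F_d."""
    parts = defaultdict(dict)
    for exps, c in poly.items():
        parts[sum(exps)][exps] = c
    return dict(parts)


def evaluate(poly, point):
    """Value of the polynomial at a point given as a tuple of numbers."""
    total = 0
    for exps, c in poly.items():
        term = c
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total


# Hypothetical sample: F vanishes at every representative (t, t) of P = [1, 1].
F = {(1, 0): 1, (0, 1): -1, (2, 0): 1, (0, 2): -1}
components = homogeneous_components(F)
x = (1, 1)

for t in range(1, 6):
    scaled = tuple(t * xi for xi in x)
    # F(t x) equals the sum over d of t^d F_d(x)
    assert evaluate(F, scaled) == sum(t ** d * evaluate(Fd, x)
                                      for d, Fd in components.items())

# Each homogeneous piece vanishes at P separately.
print({d: evaluate(Fd, x) for d, Fd in components.items()})
```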


2. In each part we argue by contradiction. For (a), if $\{X_\alpha\}$ is a system of nonempty
closed subsets of $X$ with the finite intersection property such that $\bigcap_\alpha X_\alpha = \varnothing$, then
we can inductively define a strictly decreasing sequence of finite intersections of the
$X_\alpha$'s, in contradiction to the Noetherian property. In (b), if $E$ is a closed irreducible
subset that is not connected, then $E = U \cup V$ with $U$ and $V$ nonempty, disjoint, and
relatively open. Then $E = U^c \cup V^c$ contradicts the irreducibility of $E$.


3. For (a), the continuous image of a connected set is connected. Continuity is
by Proposition 10.32, and connectedness is by Problem 2b applied to the Noetherian
topological space $V$. For (b), if $f$ is any polynomial function on $\mathbb{A}^n$, then $f \circ \varphi$ is
in $\mathcal{O}(V)$ because $\varphi$ is a morphism, and $f \circ \varphi$ is constant by Corollary 10.31. Then
$\varphi$ cannot have two distinct points in its image, since any two points in $\mathbb{A}^n$ can be
distinguished by some polynomial.


4. Certainly $\mathcal{O}(U) \supseteq k[X, Y]$. Also, the function field $k(U)$ consists of all
quotients of polynomials $a/b$ with $a$ and $b$ in $k[X, Y]$ and $b \ne 0$. Thus suppose that
$f = a/b$ lies in $\mathcal{O}(U)$. By unique factorization in $k[X, Y]$, we may assume that $a$
and $b$ are relatively prime. In the expression $f = a/b$, regularity at $P$ implies that
$b(P) \ne 0$ because an equality $a/b = c/d$ of two such expressions implies that $a = kc$
and $b = kd$ for some nonzero scalar $k$. Since $f$ is regular everywhere in $\mathbb{A}^2$ except
possibly at the origin, $b(X, Y)$ is nonvanishing away from the origin. However, if
$b$ is nonconstant, then $V(b)$ is a curve and has dimension 1, whereas the origin has
dimension 0. We conclude that $b$ is constant, and $f = a/b$ is in $k[X, Y]$.


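5. Arguing by contradiction, let $\varphi : W \to U$ be an isomorphism from an affine
variety onto $U$. Then the map $\widetilde{\varphi} : \mathcal{O}(U) \to \mathcal{O}(W) = A(W)$ given by $\widetilde{\varphi}(f) = f \circ \varphi$ is
an isomorphism. Let $\iota : U \to \mathbb{A}^2$ be the inclusion. The corresponding map on regular
functions is $\widetilde{\iota} : A(\mathbb{A}^2) \to \mathcal{O}(U)$ given by $\widetilde{\iota}(h)(x, y) = h(x, y)$ for $(x, y) \ne (0, 0)$,
and it is an isomorphism by Problem 4. Then $(\iota \circ \varphi)^\sim = \widetilde{\varphi} \circ \widetilde{\iota}$ is an isomorphism
of $A(\mathbb{A}^2)$ onto $A(W)$. Its inverse has to be of the form $\widetilde{\psi}$ with $\widetilde{\psi}(g) = g \circ \psi$ for
some isomorphism $\psi : \mathbb{A}^2 \to W$, according to Theorem 10.38. Since $\widetilde{\psi} \circ \widetilde{\varphi} \circ \widetilde{\iota}$ is the
identity map on $A(\mathbb{A}^2)$, $\iota \circ \varphi \circ \psi$ is the identity map on $\mathbb{A}^2$. Using the definition of $\iota$
shows that $\varphi \circ \psi(x, y) = (x, y)$ for $(x, y) \ne (0, 0)$. Thus $\varphi \circ \psi$ is an isomorphism of
$\mathbb{A}^2$ onto $U$ that is the identity on $U$. This is a contradiction, since there is no possible
image for $(0, 0)$ under $\varphi \circ \psi$ that makes $\varphi \circ \psi$ one-one.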


6. Let $\varphi$ be the rational map of the irreducible curve $C$ into the irreducible curve
$C'$, and let $(E, \varphi_E)$ be a morphism in the class $\varphi$. If $\varphi$ is not dominant, then $\overline{\varphi_E(E)}$
is a proper closed subset of $C'$ and must be finite. Hence $\varphi_E(E)$ is finite. The set $E$
is connected by Problem 2b, and morphisms are continuous by definition. Therefore
$\varphi_E(E)$ is connected. Being connected and finite, it is a singleton set $\{y\}$. If $\varphi_C$ is
defined as everywhere equal to $y$ on $C$, then $(C, \varphi_C)$ is in the equivalence class $\varphi$. So
$\varphi$ is constant.


7. Suppose that $f$ is a member of $\mathcal{O}_{\varphi(P)}(V)$ with $\widetilde{\varphi}_P(f) = 0$. Since the set on
which $f \in k(V)$ is regular is open, there exists an open neighborhood $E$ of $\varphi(P)$ on
which $f$ is defined. The morphism $\varphi$ is continuous, and thus $\varphi^{-1}(E)$ is open in $U$.
Since $\varphi$ is a morphism and $f$ is regular on $E$, $f \circ \varphi$ is regular on $\varphi^{-1}(E)$. According
to the proof of Proposition 10.42, $\widetilde{\varphi}(f)$ is defined to be the unique member of $k(U)$
that agrees with $f \circ \varphi$ on $\varphi^{-1}(E)$. We are assuming $\widetilde{\varphi}_P(f)$ to be 0, and thus $f \circ \varphi$
equals 0 on $\varphi^{-1}(E)$. By dominance of $\varphi$, $\varphi(\varphi^{-1}(E))$ is a dense subset of $E$. Thus
the continuous function $f$ is 0 on a dense subset of its domain $E$ and is 0.


8. The inclusion $(WX - YZ) \subseteq (X, Z)$ yields a homomorphism $\varphi$ of $A(V)$ onto
$k[W, X, Y, Z]/(X, Z) \cong k[W, Y]$. Let $b' = \varphi(b)$. Then $b'(w, y) = b(w, 0, y, 0)$
is a polynomial in $(w, y)$ nonzero in the complement of the origin. The solution
of Problem 4 shows that $b'(0, 0) \ne 0$. Thus $b(0, 0, 0, 0) \ne 0$, and $f$ is defined at
$(0, 0, 0, 0)$. In view of the discussion of this example in Section 4, $f$ is everywhere
defined. Therefore it is in $\mathcal{O}(V)$, which equals $A(V)$ because $V$ is an affine variety.
Thus there is a polynomial $g$ in $k[W, X, Y, Z]$ whose image $\bar{g}$ in $A(V)$ equals $X/Y$.
Then $Y\bar{g} = X$, and $Yg = X + (WX - YZ)h$ for some polynomial $h$. So $Y(g + hZ) =
X(1 + Wh)$. This implies that $Y$ divides $1 + Wh$, which we see is impossible by
evaluating at the origin.


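9. The equivalence of continuity of $\varphi$ and continuity of all $\varphi_\alpha$ will be taken as
known. Suppose that $\varphi : U \to V$ is a morphism. Let an index $\alpha$, an open set $E \subseteq V_\alpha$,
and a member $f$ of $\mathcal{O}(E)$ be given. We are to show that $f \circ \varphi_\alpha$ is in $\mathcal{O}(\varphi_\alpha^{-1}(E))$.
Since $\varphi$ is a morphism and $E$ is open in $V$, we know that $f \circ \varphi$ is in $\mathcal{O}(\varphi^{-1}(E))$. By
restriction, $f \circ \varphi_\alpha$ is in $\mathcal{O}(U_\alpha \cap \varphi^{-1}(E)) = \mathcal{O}(\varphi_\alpha^{-1}(E))$. Thus $\varphi_\alpha$ is a morphism.

In the reverse direction suppose that all $\varphi_\alpha : U_\alpha \to V_\alpha$ are morphisms. Let $E$
be open in $V$, and let $f$ be in $\mathcal{O}(E)$. We are to show that $f \circ \varphi$ is in $\mathcal{O}(\varphi^{-1}(E))$.
Since $\varphi^{-1}(E) = \bigcup_\alpha (U_\alpha \cap \varphi^{-1}(E))$, it is enough to prove regularity of $f \circ \varphi$ on each
$U_\alpha \cap \varphi^{-1}(E)$. On this open set, $f \circ \varphi$ equals $f \circ \varphi_\alpha$, which is regular because $\varphi_\alpha$ is
a morphism. Thus $\varphi$ is a morphism.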


10. For (a), we use the equivalence of regularity with the condition in Proposition
10.28. Thus regularity at $P$ in $U$ means that there is a subneighborhood $U_0$ of $U$
within $V$ about $P$ such that $f$ equals a quotient $\bar{a}/\bar{b}$ on $U_0$ with $\bar{a}$ and $\bar{b}$ in $A(V)$ and
with $\bar{b}$ nowhere vanishing on $U_0$. Choose polynomials $a$ and $b$ in $k[X_1, \dots, X_n]$ that
restrict to $\bar{a}$ and $\bar{b}$ on $V$. Let $U_1$ be an open subset of $\mathbb{A}^n$ whose intersection with $V$
is $U_0$. Since $b$ is nowhere 0 on $U_0$ and is continuous on $U_1$, the subset $\widetilde{U}_0$ of $U_1$ on
which $b$ is nonvanishing is open and contains $U_0$. Then Proposition 10.28 shows that
$F = a/b$ is a member of $\mathcal{O}(\widetilde{U}_0)$ whose restriction to $U_0$ equals $f$.

For (b), the result of (a) is local. Thus we can immediately allow $V$ to be quasi-affine.
Using Proposition 10.37, we can extend (a) to the case that $V$ is quasiprojective.




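11. Continuity is no problem. For the condition involving regularity, we use
Problem 10. Let $E$ be a relatively open set in $V$, and let $f$ be in $\mathcal{O}(E)$. We are to
show that $f \circ \varphi$ is in $\mathcal{O}(\varphi^{-1}(E))$. Thus let $P$ be in $\varphi^{-1}(E) \subseteq U$; then $\varphi(P)$ is in
$E \subseteq V$. Since $f$ is in $\mathcal{O}(E)$, Problem 10 produces a relatively open neighborhood $E_0$
of $\varphi(P)$, an open subset $\widetilde{E}_0$ of $Y$ with $\widetilde{E}_0 \cap V = E_0$, and a function $F$ in $\mathcal{O}(\widetilde{E}_0)$ such
that $F|_{E_0} = f|_{E_0}$. Since $\varphi : X \to Y$ is a morphism, $F \circ \varphi$ is in $\mathcal{O}(\varphi^{-1}(\widetilde{E}_0))$. Since
$\varphi(\varphi^{-1}(\widetilde{E}_0) \cap U) \subseteq \widetilde{E}_0 \cap V = E_0$, $F \circ \varphi$ agrees with $f \circ \varphi$ on $\varphi^{-1}(\widetilde{E}_0) \cap U$. Thus
$f \circ \varphi$ has an extension $F \circ \varphi$ from $\varphi^{-1}(\widetilde{E}_0) \cap U$ to $\varphi^{-1}(\widetilde{E}_0)$ that is in $\mathcal{O}(\varphi^{-1}(\widetilde{E}_0))$. The
quotients that exhibit $F \circ \varphi$ as defined at points of $\varphi^{-1}(\widetilde{E}_0) \cap U$ exhibit $f \circ \varphi$ as defined
there. The inclusion $\varphi^{-1}(E_0) = \varphi^{-1}(\widetilde{E}_0 \cap V) = \varphi^{-1}(\widetilde{E}_0) \cap \varphi^{-1}(V) \subseteq \varphi^{-1}(\widetilde{E}_0) \cap U$
shows that $f \circ \varphi$ is in $\mathcal{O}(\varphi^{-1}(E_0))$. This being true for all $P$ in $\varphi^{-1}(E)$, $f \circ \varphi$ is in
$\mathcal{O}(\varphi^{-1}(E))$.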

12. Part (a) follows by applying instances of Problem 11 to $\varphi$ and $\varphi^{-1}$. Then
(b) follows by another application of Problem 11. Part (c) follows by inductive
application of (b).

13. Let $d_i$ be the degree of homogeneity of $F_i$. Then the $i^{\text{th}}$ row of the right-hand
matrix is $\lambda^{d_i - 1}$ times the $i^{\text{th}}$ row of the left-hand matrix. Hence the dimension of the
span of the rows is the same for the two matrices, and this number is the rank.


14. This comes down to the fact that differentiating with respect to $X_j$ for $j > 0$ and
then setting $X_0$ equal to 1 is the same as setting $X_0$ equal to 1 and then differentiating
with respect to $X_j$.


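15. For any of the functions $F_i$, the right side of the formula in Euler’s Theorem is 0
at $(x_0, \dots, x_n)$ by assumption. Hence Euler’s Theorem gives

$$x_0 \frac{\partial F_i}{\partial X_0}(x_0, \dots, x_n) = -\sum_{j=1}^{n} x_j \frac{\partial F_i}{\partial X_j}(x_0, \dots, x_n).$$

This says that

$$x_0 \times 0^{\text{th}} \text{ column of } J(F)(x_0, \dots, x_n) = -\sum_{j=1}^{n} x_j \times j^{\text{th}} \text{ column of } J(F)(x_0, \dots, x_n).$$

Since $x_0 \ne 0$, this is a relation of the required type.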
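Euler's identity $\sum_j x_j\,\partial F/\partial X_j = dF$ and the resulting column relation can be checked on a single polynomial. The Python sketch below is an illustration only: the dictionary representation, the helper names, and the sample cubic $F = X_0X_1^2 - X_2^3 + X_0^2X_2$ with its zero $(1, 0, 1)$ are assumptions introduced here, not data from the problem.

```python
from fractions import Fraction

# A polynomial is stored as {exponent tuple: coefficient}; for example
# {(1, 2, 0): 1} stands for X0 * X1^2 in the variables X0, X1, X2.


def partial(poly, j):
    """Formal partial derivative with respect to the variable X_j."""
    out = {}
    for exps, c in poly.items():
        if exps[j] > 0:
            lowered = exps[:j] + (exps[j] - 1,) + exps[j + 1:]
            out[lowered] = out.get(lowered, 0) + c * exps[j]
    return out


def evaluate(poly, point):
    """Value of the polynomial at a point given as a tuple of numbers."""
    total = 0
    for exps, c in poly.items():
        term = c
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total


def euler_sum(poly, point):
    """Sum over j of x_j * (dF/dX_j)(x), the left side of Euler's identity."""
    return sum(x * evaluate(partial(poly, j), point) for j, x in enumerate(point))


# Hypothetical sample, homogeneous of degree d = 3.
F = {(1, 2, 0): 1, (0, 0, 3): -1, (2, 0, 1): 1}
d = 3
generic = (Fraction(2), Fraction(-1), Fraction(3))
zero_of_F = (1, 0, 1)      # F(1, 0, 1) = 0, and x0 = 1 is nonzero

# Column relation at a zero of F: x0 * dF/dX0 = -(x1 * dF/dX1 + x2 * dF/dX2).
lhs = zero_of_F[0] * evaluate(partial(F, 0), zero_of_F)
rhs = -sum(zero_of_F[j] * evaluate(partial(F, j), zero_of_F) for j in (1, 2))
print(euler_sum(F, generic) == d * evaluate(F, generic), lhs == rhs)
```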


16. Problem 13 shows that the left side equals $\operatorname{rank} J(F)(1, x_1/x_0, \dots, x_n/x_0)$,
which Problem 15 shows to be equal to the rank of the matrix formed from the last $n$
columns, which Problem 14 shows to be equal to the rank of $J(f)(x_1/x_0, \dots, x_n/x_0)$.


18. Regard the elements $w_{ij}$ as the entries of a matrix. The given condition is
that every 2-by-2 subdeterminant of this matrix equals 0. The matrix is not 0, and
consequently its rank is 1. Every matrix over $k$ of rank 1 is of the form $xy^t$ for column
vectors $x$ and $y$, and then $[\{w_{ij}\}]$ is exhibited as $\sigma([\{x_i\}], [\{y_j\}])$.
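The passage from vanishing 2-by-2 subdeterminants to a factorization $xy^t$ can be made explicit. The following Python sketch is an illustration under the assumption that the entries are rational numbers; the function names are hypothetical. It picks a nonzero entry $w_{i_0 j_0}$ and uses its column and row to build $x$ and $y$.

```python
from fractions import Fraction
from itertools import combinations


def all_2x2_minors_vanish(w):
    """True if every 2-by-2 subdeterminant of the matrix w is 0."""
    rows, cols = range(len(w)), range(len(w[0]))
    return all(w[i][j] * w[k][l] - w[i][l] * w[k][j] == 0
               for i, k in combinations(rows, 2)
               for j, l in combinations(cols, 2))


def rank_one_factorization(w):
    """Return (x, y) with w[i][j] == x[i] * y[j].

    Assumes w is nonzero and all of its 2-by-2 subdeterminants vanish.
    """
    i0, j0 = next((i, j) for i, row in enumerate(w)
                  for j, entry in enumerate(row) if entry != 0)
    pivot = Fraction(w[i0][j0])
    x = [Fraction(w[i][j0]) / pivot for i in range(len(w))]   # scaled column j0
    y = [Fraction(w[i0][j]) for j in range(len(w[0]))]        # row i0
    return x, y


w = [[2, 4, 6], [1, 2, 3]]
x, y = rank_one_factorization(w)
print(all_2x2_minors_vanish(w), x, y)
```

The factorization works because the vanishing of the minor on rows $i, i_0$ and columns $j, j_0$ says exactly that $w_{ij}w_{i_0j_0} = w_{ij_0}w_{i_0j}$.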


19. For (a), one suitable monomial ordering is the lexicographic ordering that
takes the elements $W_{ij}$ in the order $W_{00}, W_{01}, \dots, W_{mn}$ with $W_{00}$ largest. Given a
monomial $M'$ of total degree $d$, choose among all monomials of total degree $d$ the
smallest one in the ordering that is congruent to $M'$ modulo $\mathfrak{a}$. Write $M = \prod_{i,j} W_{ij}^{a_{ij}}$.
If $a_{ij} > 0$ and if there exists $(k, l)$ with $l > j$, $k > i$, and $a_{kl} > 0$, then $W_{ij}W_{kl}$ divides
$M$. Write $M_0 = M/(W_{ij}W_{kl})$. Put $M'' = M_0 W_{il}W_{kj}$. Since $W_{ij}W_{kl} - W_{il}W_{kj}$ is in
$\mathfrak{a}$, $M''$ is congruent to $M$ modulo $\mathfrak{a}$. In the monomial ordering, all of the elements
$W_{kl}, W_{il}, W_{kj}$ are smaller than $W_{ij}$. Therefore $M'' < M$, in contradiction to the
minimality of $M$.
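The rewriting step in (a) can be run as an algorithm. The Python sketch below is an illustration and not the author's procedure: a monomial is represented, by assumption, as a list of index pairs $(i, j)$ standing for the factors $W_{ij}$, and the hypothetical helper `segre_image` records the exponents of the $X_i$ and $Y_j$ in the image of the monomial under $W_{ij} \mapsto X_iY_j$. Each swap lowers the monomial in the ordering, so the loop terminates.

```python
from collections import Counter


def straighten(pairs):
    """Apply W_ij W_kl -> W_il W_kj while some pair of factors has k > i and l > j."""
    pairs = list(pairs)
    changed = True
    while changed:
        changed = False
        for p in range(len(pairs)):
            for q in range(len(pairs)):
                (i, j), (k, l) = pairs[p], pairs[q]
                if k > i and l > j:
                    pairs[p], pairs[q] = (i, l), (k, j)
                    changed = True
    return sorted(pairs)


def segre_image(pairs):
    """Exponents of the X_i and Y_j in the image of the monomial."""
    return (Counter(("X", i) for i, _ in pairs)
            + Counter(("Y", j) for _, j in pairs))


monomial = [(0, 0), (0, 0), (1, 2), (2, 1)]       # W00^2 * W12 * W21
reduced = straighten(monomial)
print(reduced, segre_image(reduced) == segre_image(monomial))
```

The image under $W_{ij} \mapsto X_iY_j$ is unchanged by each swap, which reflects the fact that the differences $W_{ij}W_{kl} - W_{il}W_{kj}$ lie in the kernel.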

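In (b), let the largest $W_{ij}$ whose exponents in $M$ and $M'$ are unequal be $W_{i_0 j_0}$. Let
the products of the powers of the strictly larger monomials be $N$ and $N'$, respectively.
It is enough to prove that $\varphi(M/N) \ne \varphi(M'/N')$. Then we have

$$M/N = \prod_{W_{ij} \le W_{i_0 j_0}} W_{ij}^{a_{ij}} = W_{i_0 j_0}^{a_{i_0 j_0}} \prod_{\substack{(i,j) \text{ with } i_0 < i \text{ or} \\ (i_0 = i \text{ and } j_0 < j)}} W_{ij}^{a_{ij}}, \qquad a_{i_0 j_0} \ne 0,$$

and a similar expression for $M'/N'$. The minimality condition says that $a_{ij} = 0$ if
$i_0 < i$ and $j_0 < j$. Thus

$$M/N = \Big(\prod_{i_0 < i,\ j_0 \ge j} W_{ij}^{a_{ij}}\Big)\Big(\prod_{i_0 = i,\ j_0 \le j} W_{ij}^{a_{ij}}\Big) = \Big(\prod_{k > i_0}\prod_{l \le j_0} W_{kl}^{a_{kl}}\Big)\Big(\prod_{l \ge j_0} W_{i_0 l}^{a_{i_0 l}}\Big)$$

and

$$\varphi(M/N) = \Big(\prod_{k > i_0}\prod_{l \le j_0} X_k^{a_{kl}} Y_l^{a_{kl}}\Big)\Big(\prod_{l \ge j_0} X_{i_0}^{a_{i_0 l}} Y_l^{a_{i_0 l}}\Big).$$

On the right side each pair of indices $(k, l)$ occurs at most once. Thus an equality
$\varphi(M/N) = \varphi(M'/N')$ would imply that $a_{kl} = b_{kl}$ for every $(k, l)$. This proves (b).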

In (c), we know that $\mathfrak{a} \subseteq \ker \varphi$. If equality fails, then there is a linear combination
$\sum_r c_r M_r$ of monomials in $\ker \varphi$ that is not in $\mathfrak{a}$. Applying (a), we may assume that
each $M_r$ is reduced. Then $\sum_r c_r \varphi(M_r) = 0$. Each $\varphi(M_r)$ is a monomial, and (b)
shows that the various monomials $\varphi(M_r)$ are distinct. Since the set of monomials is
linearly independent, each $c_r$ is 0. Therefore $\sum_r c_r M_r = 0$, contradiction.


20. For (a), compute the kernel of the natural substitution homomorphism of
$k[X_0, \dots, X_m, Y_0, \dots, Y_n]$ into $R[Y_0, \dots, Y_n]$. For (b), let $P = [y_0, y_1, \dots, y_n]$,
$\mathfrak{p} = I(U) \subseteq k[X_0, \dots, X_m]$, and $\mathfrak{q} = I(\{P\}) \subseteq k[Y_0, \dots, Y_n]$. The inside
homomorphism has kernel $\mathfrak{a}$ by Problem 19. The outside homomorphism takes
$X_0, \dots, X_m$ into $R$ and takes each $Y_j$ to $y_jZ$, where $Z$ is an indeterminate; its kernel
is isomorphic to $\mathfrak{p}\mathfrak{q}$. The kernel of the composition is $I(\sigma(U \times \{P\}))$, which is prime
because $R[Z]$ is an integral domain.

21. See Fulton’s book, page 145. 

22. See Fulton’s book, page 146. 

23. For (a), Proposition 10.9 shows that $I(V(I)) = (h(X, Y))$ for an irreducible
polynomial $h$ if $\dim V(I) = 1$. The containment $I \subseteq I(V(I))$ shows that each $f_j$
has to be of the form $f_j = a_jh$ for some $a_j$ in $k[X, Y]$. Since $f_j$ and $h$ are irreducible,
$a_j$ has to be a scalar. Thus $I = (h(X, Y))$, and $I$ is prime. For (b), one can take
$I = (Y + X^2, Y - X^2)$, which has $V(I) = \{(0, 0)\}$ and which is not prime because
it contains $X^2$ but not $X$.




24. Let $\{g_1, \dots, g_s\}$ be a minimal Gröbner basis, and suppose that $g_j = ab$ is a
nontrivial factorization of $g_j$ in $k[X_1, \dots, X_n]$. Since $I$ is prime, we may assume that
$a$ lies in $I$. Then $\mathrm{LM}(g_j) = \mathrm{LM}(a)\,\mathrm{LM}(b)$, and $\mathrm{LM}(a)$ lies in $\mathrm{LT}(I)$. Since $\{g_1, \dots, g_s\}$
is a Gröbner basis, $\mathrm{LM}(a)$ lies in the monomial ideal $(\mathrm{LM}(g_1), \dots, \mathrm{LM}(g_s))$. By
Lemma 8.17, $\mathrm{LM}(g_i)$ divides $\mathrm{LM}(a)$ for some $i$. It follows that $\mathrm{LM}(g_i)$ divides $\mathrm{LM}(g_j)$.
Since the Gröbner basis is minimal, $i = j$. That is, $\mathrm{LM}(g_i) = \mathrm{LM}(a) = \mathrm{LM}(g_j)$.
Thus $\mathrm{LM}(b) = 1$, in contradiction to the assumption that the factorization of $g_j$ is
nontrivial.


25. Identify $a_{11}X^2 + 2a_{12}XY + a_{22}Y^2 + 2a_{13}XZ + 2a_{23}YZ + a_{33}Z^2$ with the
symmetric matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}.$$

By the Principal Axis Theorem choose an invertible matrix $M$ such that $A' = M^tAM$
is diagonal. Put $\begin{pmatrix} X' \\ Y' \\ Z' \end{pmatrix} = M^{-1}\begin{pmatrix} X \\ Y \\ Z \end{pmatrix}$ and substitute. Then the given quadratic
polynomial equals $\alpha X'^2 + \beta Y'^2 + \gamma Z'^2$, where $\alpha, \beta, \gamma$ are the diagonal entries of
$A'$. If $\alpha\beta\gamma = 0$, this is reducible; it is readily checked to be irreducible if $\alpha\beta\gamma \ne 0$.
Since $\alpha\beta\gamma = \det A' = (\det M)^2 \det A$, the reducible polynomials correspond to the
affine hypersurface on which $\det A = 0$.
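The determinant criterion is easy to try on examples. The Python sketch below is an illustration only: the function names are hypothetical, and rational coefficients are assumed, so that "reducible" is to be read over an algebraically closed field containing them, as in the setting of the problem.

```python
from fractions import Fraction


def conic_matrix(a11, a12, a22, a13, a23, a33):
    """Symmetric matrix of a11 X^2 + 2 a12 XY + a22 Y^2 + 2 a13 XZ + 2 a23 YZ + a33 Z^2."""
    return [[a11, a12, a13],
            [a12, a22, a23],
            [a13, a23, a33]]


def det3(m):
    """Determinant of a 3-by-3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def is_reducible_conic(a11, a12, a22, a13, a23, a33):
    """Criterion of Problem 25: the quadratic polynomial is reducible iff det A = 0."""
    return det3(conic_matrix(a11, a12, a22, a13, a23, a33)) == 0


half = Fraction(1, 2)
print(is_reducible_conic(0, half, 0, 0, 0, 0))     # XY = X * Y
print(is_reducible_conic(1, 0, -1, 0, 0, 0))       # X^2 - Y^2 = (X - Y)(X + Y)
print(is_reducible_conic(1, 0, 1, 0, 0, -1))       # X^2 + Y^2 - Z^2
```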

26. The first conclusion is a special case of Corollary 9.19. Then take $x$ to be a
nonconstant member of $L(2v_O)$, and take $y$ to be a member of $L(3v_O)$ not in the
linear span of $\{1, x\}$. Corollary 9.22 shows that $(x)_\infty = 2v_O$, and then the equality
$(y)_\infty = 3v_O$ follows from the definitions.

27. These are special cases of Theorem 9.3. 

28. Since $2 = [k(E) : k(x)] = [k(E) : k(x, y)]\,[k(x, y) : k(x)]$, the integer
$[k(E) : k(x, y)]$ divides 2. The corresponding equality with 3 and $k(y)$ shows that
$[k(E) : k(x, y)]$ divides 3. Therefore $[k(E) : k(x, y)] = 1$.

29. The values of $v_O$ on the seven listed members of $k(E)$ are $0, -2, -3, -4, -5, -6, -6$,
respectively. The members are all in $L(6v_O)$, which has dimension 6 by Problem 28,
and thus the listed members are linearly dependent. If $y^2$ or $x^3$ does not contribute
to this dependence, then $v_O$ takes distinct values on the remaining six members of
$L(6v_O)$, and Problem 19a at the end of Chapter VI gives a contradiction. Hence the
coefficients $b$ and $c$ of $y^2$ and $x^3$, respectively, are nonzero. If $x$ and $y$ are replaced by
$-bcx$ and $bc^2y$ and if the linear combination of terms is then divided by $b^3c^4$, then
the linear dependence takes the form $(y^2 + a_1xy + a_3y) - (x^3 + a_2x^2 + a_4x + a_6) = 0$,
as required. Hence $\varphi$ carries $E - \{O\}$ into $C \cap \mathbb{A}^2$.

30. Certainly $f(X, Y)$ is not divisible by any nonconstant polynomial in $X$. Thus
the only possible reducibility is of the form $f(X, Y) = (Y + p(X))(Y + q(X))$.
Expanding out the right side shows that

$$\begin{aligned}
p(X) + q(X) &= a_1X + a_3, \\
p(X)q(X) &= -(X^3 + a_2X^2 + a_4X + a_6).
\end{aligned}$$

The second equation shows that at least one of $p(X)$ and $q(X)$ has degree $> 1$, and
then the first equation shows that $\deg p(X) = \deg q(X)$. But this equality would
mean that $\deg p(X)q(X)$ is even, contradiction. Hence $f(X, Y)$ is irreducible.

31. The function $\varphi$ is a morphism of $E - \{O\}$ into $C \cap \mathbb{A}^2$ by Lemma 10.39, and
the composition with $\beta_0$ is a morphism into $\mathbb{P}^2$. Then $\varphi$ is a morphism of $E - \{O\}$
into $C$ by Problem 11. The class of $(E - \{O\}, \varphi)$ is therefore a rational map of $E$
into $C$, and Corollary 10.54 shows that $\varphi$ extends to a morphism $\Phi : E \to C$.

32. Let $\widetilde{\Phi} : k(C) \to k(E)$ be the field mapping that corresponds to $\Phi$ under
Theorem 10.45. The field $k(C)$ is generated by the functions $x_0$ and $y_0$ that pick
out the coordinates of points of $C \cap \mathbb{A}^2$, and Theorem 10.45 shows that $\widetilde{\Phi}(x_0) =
(\text{class of } x_0 \circ \varphi)$. For $P$ in $E - \{O\}$, this has $\widetilde{\Phi}(x_0)(P) = x_0(\varphi(P)) = x(P)$, i.e.,
$\widetilde{\Phi}(x_0) = x$. Similarly $\widetilde{\Phi}(y_0) = y$. Therefore $\widetilde{\Phi}(k(C)) = k(x, y)$. By Problem 28,
$\widetilde{\Phi}$ is onto $k(E)$. By Corollary 10.46, $\Phi$ is birational.


33. The homogeneous polynomial of degree 3 from which $f(X, Y)$ arises is

$$F(X, Y, W) = (Y^2W + a_1XYW + a_3YW^2) - (X^3 + a_2X^2W + a_4XW^2 + a_6W^3).$$

The points of $C$ on the line at infinity arise by setting $W = 0$ and $F(X, Y, W) =
0$ simultaneously, and the only such point is $[0, 1, 0]$. Computation shows that
$\frac{\partial F}{\partial W}(0, 1, 0) = 1$. Consequently $[0, 1, 0]$ is a nonsingular point of $C$.
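The two computations in this hint can be spelled out numerically. In the Python sketch below, the functions `F` and `dF_dW` are hypothetical names, the partial derivative is written out term by term by hand, and the coefficient values are arbitrary sample integers chosen only for illustration.

```python
def F(X, Y, W, a1, a2, a3, a4, a6):
    """Homogeneous cubic of Problem 33."""
    return ((Y**2 * W + a1 * X * Y * W + a3 * Y * W**2)
            - (X**3 + a2 * X**2 * W + a4 * X * W**2 + a6 * W**3))


def dF_dW(X, Y, W, a1, a2, a3, a4, a6):
    """Partial derivative of F with respect to W, differentiated term by term."""
    return ((Y**2 + a1 * X * Y + 2 * a3 * Y * W)
            - (a2 * X**2 + 2 * a4 * X * W + 3 * a6 * W**2))


coeffs = dict(a1=1, a2=-2, a3=3, a4=5, a6=-7)     # arbitrary sample coefficients

# On the line W = 0 the cubic collapses to -X^3, so F = 0 forces X = 0.
assert all(F(X, Y, 0, **coeffs) == -X**3
           for X in range(-3, 4) for Y in range(-3, 4))

print(F(0, 1, 0, **coeffs), dF_dW(0, 1, 0, **coeffs))
```

The value of the partial derivative at $[0, 1, 0]$ comes from the single term $Y^2$ and so does not depend on the coefficients $a_1, \dots, a_6$.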
34. A point $(x_0, y_0)$ in $\mathbb{A}^2$ is a singular point of $C$ if and only if $f(x_0, y_0) =
\frac{\partial f}{\partial X}(x_0, y_0) = \frac{\partial f}{\partial Y}(x_0, y_0) = 0$. At $(x_0, y_0)$, computation shows that

$$\frac{\partial^2 f}{\partial X^2} = -6x_0 - 2a_2, \qquad \frac{\partial^2 f}{\partial X\,\partial Y} = a_1, \qquad \frac{\partial^2 f}{\partial Y^2} = 2, \qquad \frac{\partial^3 f}{\partial X^3} = -6.$$

All higher-order derivatives are 0. Application of Taylor’s formula about $(x_0, y_0)$
therefore gives

$$f(X, Y) = (-3x_0 - a_2)(X - x_0)^2 + a_1(X - x_0)(Y - y_0) + (Y - y_0)^2 - (X - x_0)^3.$$

We put $X = x$ and $Y = y$, taking into account that $f(x, y) = 0$. After division by
$(x - x_0)^2$, the result is that

$$\big((y - y_0)(x - x_0)^{-1}\big)^2 + a_1(y - y_0)(x - x_0)^{-1} = (3x_0 + a_2) + (x - x_0).$$

That is, $z^2 + a_1z = (3x_0 + a_2) + (x - x_0)$. Suppose that $P$ is in $E - \{O\}$ and that
$v_P(z) < 0$. Then we have $v_P(z + a_1) < 0$ and

$$0 \le v_P\big((3x_0 + a_2) + (x - x_0)\big) = v_P(z^2 + a_1z) = v_P(z) + v_P(z + a_1) < 0,$$

contradiction. Therefore $v_P(z) \ge 0$. Meanwhile, $v_O(x - x_0) = v_O(x) = -2$ and
$v_O(y - y_0) = v_O(y) = -3$. Hence $v_O(z) = (-3) - (-2) = -1$.

35. Corollary 9.22 shows that no member of $k(E)$ has the properties of $z$ found
in Problem 34. Thus $C$ is nonsingular at every $(x_0, y_0)$. In combination with
Problem 33, this shows that $C$ is everywhere nonsingular. By Corollary 10.55, $\Phi$ is an
isomorphism.


SELECTED REFERENCES 


Artin, E., Theory of Algebraic Numbers, notes by Gerhard Würges, George Striker,
Göttingen, 1959.

Artin, E., C. J. Nesbitt, and R. M. Thrall, Rings with Minimum Condition, University 
of Michigan Press, Ann Arbor, 1944. 

Atiyah, M. F., and I. G. Macdonald, Introduction to Commutative Algebra, Addison- 
Wesley Publishing Company, Reading, MA, 1969. 

Borevich, Z. I., and I. R. Shafarevich, Number Theory, Academic Press, New York, 
1966. 

Brieskorn, E., and H. Knörrer, Plane Algebraic Curves, Birkhäuser, Basel, 1986.

Brown, K. S., Cohomology of Groups, Springer-Verlag, New York, 1982.

Buchberger, B., and F. Winkler (eds.), Gröbner Bases and Applications, Cambridge
University Press, Cambridge, 1998.

Buell, D. A., Binary Quadratic Forms, Springer-Verlag, New York, 1989. 

Cartan, H., and S. Eilenberg, Homological Algebra, Princeton University Press, 
Princeton, 1956. 

Cassels, J. W. S., Rational Quadratic Forms, Academic Press, London, 1978. 

Cassels, J. W. S., and A. Fröhlich (eds.), Algebraic Number Theory, Academic Press,
London, 1967.

Cohn, H., A Classical Invitation to Algebraic Numbers and Class Fields, Springer- 
Verlag, New York, 1978. 

Cox, D. A., Primes of the Form $x^2 + ny^2$, John Wiley & Sons, New York, 1989.

Cox, D., J. Little, and D. O’Shea, Ideals, Varieties, and Algorithms, Springer-Verlag,
New York, 1992.

Dirichlet, P. G. L., Lectures on Number Theory, supplements by R. Dedekind, English
translation of the original German, American Mathematical Society,
Providence, 1999.

Dummit, D. S., and R. M. Foote, Abstract Algebra, Prentice Hall, Englewood Cliffs, 
NJ, 1991; second edition, Upper Saddle River, NJ, 1999; third edition, John 
Wiley & Sons, Hoboken, NJ, 2004. 

Eisenbud, D., Commutative Algebra with a View Toward Algebraic Geometry, 
Springer-Verlag, New York, 1995. 

Eisenbud, D., and J. Harris, The Geometry of Schemes, Springer-Verlag, New York, 
2000. 

Farb, B., and R. K. Dennis, Noncommutative Algebra, Springer-Verlag, New York, 
1993. 




Farkas, H. M., and I. Kra, Riemann Surfaces, Springer-Verlag, New York, 1980; 
second edition, 1992. 

Freyd, P., Abelian Categories: An Introduction to the Theory of Functors, Harper and 
Row, New York, 1964. 

Fröhlich, A., and M. J. Taylor, Algebraic Number Theory, Cambridge University
Press, Cambridge, 1991.

Fulton, W., Algebraic Curves: An Introduction to Algebraic Geometry, Addison- 
Wesley Publishing Company, Redwood City, CA, 1989; originally published 
by W. A. Benjamin, Inc., New York, 1969, and reprinted in 1974. 

Gauss, C. F., Disquisitiones Arithmeticae, English translation of the original Latin, 
Springer-Verlag, New York, 1986. 

Griffiths, P. A., Introduction to Algebraic Curves, American Mathematical Society, 
Providence, 1989. 

Gunning, R. C., Introduction to Holomorphic Functions of Several Variables,
Volume III: Homological Theory, Wadsworth & Brooks/Cole, Pacific Grove, CA,
1990.

Hall, M., The Theory of Groups, The Macmillan Company, New York, 1959. 

Hardy, G. H., and E. M. Wright, An Introduction to the Theory of Numbers, Clarendon 
Press, Oxford, 1938; second edition, 1945; third edition, 1954; fourth edition, 
1960; fifth edition, 1979. 

Harris, J., Algebraic Geometry: A First Course, Springer-Verlag, New York, 1992; 
reprinted with corrections, 1995. 

Hartshorne, R., Algebraic Geometry, Springer-Verlag, New York, 1977. 

Hasse, H., Number Theory, English translation of the original German, Springer- 
Verlag, Berlin, 1980; reprinted, 2002. 

Hecke, E., Lectures on the Theory of Algebraic Numbers, English translation of the 
original German, Springer-Verlag, New York, 1981. 

Hilton, P. J., and U. Stammbach, A Course in Homological Algebra, Springer-Verlag, 
New York, 1971; second edition, 1997. 

Hua, L.-K., Introduction to Number Theory, English translation of the original
Chinese, Springer-Verlag, Berlin, 1982.

Hungerford, T. W., Algebra, Holt, Rinehart and Winston, New York, 1974; reprinted, 
Springer-Verlag, New York, 1980; reprinted with corrections, 1996. 

Ireland, K., and M. Rosen, A Classical Introduction to Modern Number Theory,
Springer-Verlag, New York, 1982; second edition, 1990.

Jacobson, N., Basic Algebra, Volume I, W. H. Freeman and Co., San Francisco, 
1974; second edition, New York, 1985. Volume II, W. H. Freeman and Co., 
San Francisco, 1980; second edition, New York, 1989. 

Jacobson, N., Lectures in Abstract Algebra, Volume I, D. Van Nostrand Company, 
Inc., Princeton, 1951; reprinted, Springer-Verlag, New York, 1975. Volume II, 
D. Van Nostrand Company, Inc., Princeton, 1953; reprinted, Springer-Verlag, 
New York, 1975. Volume III, D. Van Nostrand Company, Inc., Princeton, 
1964; reprinted, Springer-Verlag, New York, 1975. 




Jacobson, N., Structure of Rings, American Mathematical Society, Providence, 1956; 
revised edition, 1964. 

Jacobson, N., The Theory of Rings, American Mathematical Society, New York, 1943; 
multiple reprintings, American Mathematical Society, Providence. 

Knapp, A. W., Elliptic Curves, Princeton University Press, Princeton, 1992. 

Knapp, A. W., Basic Real Analysis, Birkhäuser, Boston, 2005.

Knapp, A. W., Advanced Real Analysis, Birkhäuser, Boston, 2005.

Knapp, A. W., Basic Algebra, Birkhäuser, Boston, 2006; digital second edition, 2016.

Knapp, A. W., and D. A. Vogan, Cohomological Induction and Unitary
Representations, Princeton University Press, Princeton, 1995.

Lam, T. Y., A First Course in Noncommutative Rings, Springer-Verlag, New York, 
1991; second edition, 2001. 

Lang, S., Algebra, Addison-Wesley, Reading, MA, 1965; second edition, 1984; third 
edition, 1993; revised third edition, Springer, New York, 2002. 

Lang, S., Algebraic Number Theory, Addison-Wesley, Reading, MA, 1970; reprinted, 
Springer-Verlag, New York, 1986; second edition, 1994. 

Lang, S., Introduction to Algebraic and Abelian Functions, Addison-Wesley, Reading, 
MA, 1972; second edition, Springer-Verlag, New York, 1982; reprinted, 1995. 

Mac Lane, S., Categories for the Working Mathematician, Springer, New York, 1971; 
second edition, 1998. 

Macdonald, I. G., Algebraic Geometry: Introduction to Schemes, W. A. Benjamin, 
Inc., New York, 1968. 

Massey, W. S., Singular Homology Theory, Springer-Verlag, New York, 1980. 

Matsumura, H., Commutative Algebra, W. A. Benjamin, Inc., New York, 1970; second 
edition, Benjamin-Cummings, Reading, MA, 1980. 

Matsumura, H., Commutative Ring Theory, Cambridge University Press, Cambridge, 
1986; reprinted with corrections, 1989. 

Muir, T., The Theory of Determinants in the Historical Order of Development,
Volume I, Macmillan, London, 1906; Volume II, 1911; Volume III, 1920;
Volume IV, 1923; Volume V, 1929. Reprint of Volumes I-II, Dover
Publications, New York, 1960; reprint of Volumes III-IV, 1960.

Mumford, D., The Red Book of Varieties and Schemes, Lecture Notes in Mathematics, 
Volume 1358, Springer-Verlag, Berlin, 1988; second expanded edition, 1999. 

Niven, I., and H. S. Zuckerman, An Introduction to the Theory of Numbers, John 
Wiley & Sons, New York, 1960; second edition, 1966; third edition, 1972; 
fourth edition, 1980; fifth edition, with H. L. Montgomery, 1991. 

Reid, M., Undergraduate Algebraic Geometry, Cambridge University Press,
Cambridge, 1988.

Rotman, J. J., Notes on Homological Algebras, Van Nostrand Reinhold Company, 
New York, 1970. 

Rudin, W., Principles of Mathematical Analysis, McGraw-Hill Book Company, New
York, 1953; second edition, 1964; third edition, 1976.




St. Andrews, School of Mathematics and Statistics, University of St. Andrews,
Scotland, MacTutor History of Mathematics Archive, Biographies of
Mathematicians, http://www-groups.dcs.st-and.ac.uk for background,
http://www-history.mcs.st-and.ac.uk/history for entry point,
http://www-history.mcs.st-and.ac.uk/history/BiogIndex.html for indices of
biographies of mathematicians, updated as of 2015.


Serre, J.-P., Algèbre Locale, Multiplicités, second edition, Lecture Notes in
Mathematics, Volume 11, Springer-Verlag, Berlin, 1965.

Serre, J.-P., A Course in Arithmetic, Springer-Verlag, New York, 1973. 

Serre, J.-P., Local Fields, Springer-Verlag, New York, 1979. 

Shafarevich, I. R., Basic Algebraic Geometry, Springer-Verlag, Berlin, 1977; second 
edition, published as two volumes, 1994. 

Silverman, J. H., The Arithmetic of Elliptic Curves, Springer-Verlag, New York, 1986. 

Sturmfels, B., What is a Gröbner basis?, Notices of the American Mathematical
Society 52 (2005), 1199-1200.

University of Sydney, Representation and monomial orders, Handbook of Magma
Computer Algebra System,
http://magma.maths.usyd.edu.au/magma/handbook/text/1177, updated as of 2015.

Ueno, K., An Introduction to Algebraic Geometry, American Mathematical Society, 
Providence, 1997. 

Van der Waerden, B. L., Modern Algebra, English translation of the original German, 
Volume I, Frederick Ungar Publishing Co., New York, 1949. Volume II, 
Frederick Ungar Publishing Co., New York, 1950. 

Villa Salvador, G. D., Topics in the Theory of Algebraic Function Fields, Birkhäuser, 
Boston, 2006. 

Walker, R. J., Algebraic Curves, Princeton University Press, Princeton, 1950; 
reprinted, Dover Publications, New York, 1962; reprinted, Springer-Verlag, 
New York, 1978. 

Weil, A., Basic Number Theory, Springer-Verlag, New York, 1967; second edition, 
1973; third edition, 1974; reprinted, 1995. 

Weil, A., Number Theory: An Approach through History: From Hammurapi to 
Legendre, Birkhäuser, Boston, 1984; reprinted, 2007. 

Zariski, O., and P. Samuel, Commutative Algebra, Volume I, D. Van Nostrand Co., 
Inc., Princeton, 1958; reprinted, Springer-Verlag, New York, 1975. Volume II, 
D. Van Nostrand Co., Inc., Princeton, 1960; reprinted, Springer-Verlag, New 
York, 1976. 


INDEX OF NOTATION 


This list indexes recurring symbols introduced in Chapters I through X (pages 
1-648). For recurring symbols introduced in Basic Algebra, see the list of 
Notation and Terminology on pages xxiii-xxvi. Some of the latter notation has 
been repeated here for the reader’s convenience. 

In the list below, each piece of notation is regarded as having a key symbol. 
The first group consists of those items for which the key symbol is a fixed Latin 
letter, and the items are arranged roughly alphabetically by that key symbol. The 
next group consists of those items for which the key symbol is a Greek letter. The 
final group consists of those items for which the key symbol is a variable or a 
nonletter, and these are arranged by type. To locate an item below, first proceed 
on the assumption that the key symbol is a Latin or Greek letter; if the item does 
not appear to be in the list, then treat it as if its key symbol is a variable or a nonletter. 


A, Ax, 389, 559 coker f, 175 

A", Ak, 455, 559 D(&), 279 

A(K, Gal(K/F), a), 137 D(K/F), 372 
Ar, 542, 543 Dr, 532, 549 
Ai, 542, 543 Dro, 534 

A(V), 579 Dx, 267 

A, 570 D(L), 267 

Aa, 570 Diff(F), 547 
A(V), 584 Div(w), 548 
A(V)a, 585 d_,, 194 

“0, 639 dn, 153, 154 
B(F), 126 X = {(Xn, dn} 4, 174 
B(K/F), 127 dim R, 424 

C, 330 Ext,(A, B), 223 
C(a), 620 éi, fir g, 275, 354 
Cr, 169 (€j,5 wae Cje)s 619 
C(V(a)), 633 extR(A, B), 223 
Cr, 532, 549 Fy([X]], 347 
Cro, 534 F,((X)), 347 
E°, complement, xxiii Fy, 346 

coimage f, 240 Fry, 437 






fos 533 LCM(X®, X*), 501 
Gp, 368 Log, 289 
Gal(Fp/F,), 434 LM(f), LC(f), LT(S), 496 
Sciex’ SGreviex’ 494 En), 497 

8, 538 =rpx? 493 

2x, 538 L(A), 536 

H(s, a), 633 lim, 439 

Ha(s,a), 621, 626 M, 493, 620 
H(s,a), 633 Mp, 600 

H,(s, a), 625, 628 M,, 431 

Hj, 620 Mp, 600 

H,(X), 153, 172 m,, 431 

H"(X), 153, 174 mp(F), 474 

H,(X), 172 N(J), 39, 273 
H*(X), 174 N_{A/F}(·), 165 

H_n(G, M), 209 N_{K/F}(·), norm, xxvi 
H^n(G, M), 147 Nrd_{A/F}(·), 165 
Hompr(A, B), 169 O(U), 580, 582, 587, 641 
h(D), 7, 14 Op(U), 582, 587 
hx, 299 Op(V), 580, 585 

I, Ix, 390 R°, opposite ring, xxiv 
I', 390 ord,(A), 532 

T, 330, 393, 576 P’, 456 

T, 576 P", 457, 570 
I=(r,r2), 38 Py, 457 

I= (r,r,), 38 P, 330, 393 

I(E), 560 Pr, 532, 549 

I(P), 571 Py, 322, 533 

I(P, FOG), 474 Q,, 316, 318 
I(P,LOF), 467 Rf, g), 451 

image f, 240 R(f, g), 451 

J(&), 272 R(fi, F), 514 

K(S), 409 Ry, 346 

K(E), 412 Ry, 322, 533 

k, 528, 559 R,, 431 

k(V), 580, 585 Residue, 542 

k’, 531 Residue,(y), 541 
L(A), 544 r1, 72, 348, 383 
L(A), 535 rad A, 78 


L(s, χ), 63 
S(f_1, f_2), 502 


Soo, 391 

S-!R, localization, xxvi 
Spec A, 639 

(Spec A, O), 641 
Tor®(A, B), 224 
Tr_{A/F}(·), 165 
Tr_{K/F}(·), trace, xxvi 
Trrd_{A/F}(·), 165 

A’, transpose, xxiii 
tor’ (A, B), 224 
tr.deg R, 424 

V(C), Vx(C), 455-456 
VW), 429 

V(S), 559, 571 
V(fi, Ree fi); 559 
Ve, 532 

vp(-), 321 

Voo, 328 

X(S), 388 

X*, 494, 620 

xj(P), 559 

Z(L), 268 

Zp, 318 

Z, 437 


ZG, integral group ring, xxv 


Greek 

aj, 369 

Bo, 574 

Bo, 574 

Bi, 369, 575 

6(A), 543 

δ_ij, Kronecker delta, xxiii 
€, 149, 195 

ni, 369 

t, 390, 391 

o, 617, 646 
O1,..-,0n7, 288, 383 
Xo, 62 

w, 185, 547 




Functors given by subscripts 
and superscripts 

R*, units, xxiv 

Rp, localization, xxvi 

Xt, 194 

K_alg, algebraic closure, 434 
K_sep, separable algebraic closure, 434 
M^G, invariants, 208 

M_G, coinvariants, 209 

M, dual fractional ideal, 372 
M,, 376 

L®, 460 


Specific functions 

α = (α_1, ..., α_n), multi-index, 494 
|α|, 620 

(5) Legendre symbol, 8 

(“), Jacobi symbol, 68 

[K : F], degree, xxvi 

|= phe 

| - |, absolute value, 331 

| - |], norm, 356 

(x)o, )oo, 532 


Isolated symbols 

~, Brauer equivalent, 124 

~, homotopic, 154 

On, 153, 172 

d_;, 194 

I’. restricted direct product, 388 


Operations on sets and classes 
RG, group algebra, xxv 

J/1, radical, 405 

K[X1,..., Xnaila, 458 

Ass B, morphism, 235 


Miscellaneous 
(x), principal divisor, 532 
(X_i)_{i∈I}, 388 




I =(rj,r2), generated ideal, 38 
I= (r1, 72), 38 

[x, y, w], point in P^2, 459 
[x_0, ..., x_n], point in P^n, 570 




yg = {(E, ¢z)}, rational map, 595 
X = {(X_n, ∂_n)}_{n=-∞}^{∞}, 171 


(F,| - |r), valued field, 342 
{O(U), pvy}, presheaf, 640 


INDEX 


Abel, 521 
abelian category, 238 
abelian group 
divisible, 196 
torsion, 169 
abelian Lie algebra, 78 
absolute discriminant, 35, 267 
absolute norm of ideal, 39, 273 
absolute value, 289, 331 
archimedean, 289 
discrete, 338 
nontrivial, 332 
normalized, 383, 384, 385, 386 
of idele, 390 
trivial, 331 
acyclic resolution, 219 
additive category, 233 
additive functor, 170, 178 
adele, 389 
adjoint, 252 
affine algebraic set, 559 
dimension of, 566 
irreducible, 563 
affine coordinate ring, 579 
affine curve, irreducible, 529 
affine Hilbert function, 621, 626 
affine Hilbert polynomial, 625, 628 
affine hypersurface, irreducible, 430, 562 
affine local coordinates, 461 
affine n-space, 455, 559 
affine plane curve, 455 
irreducible, 430, 524, 562 
affine plane line, 455 
affine scheme, 642 
affine variety, 429, 562 
algebra, xxv 
abelian Lie, 78 
central, 111 
central simple, 111 
crossed-product, 137 
cyclic, 122, 162, 163 
generalized quaternion, 121 
Lie, 77 
semisimple associative, 80 
semisimple Lie, 79 
simple associative, 80 
simple Lie, 79 
solvable Lie, 78 
tensor product for, 104 
Weyl, 85 
algebra polynomial, 164 
algebraic closure, separable, 434 
algebraic set 
affine, 559 
irreducible affine, 563 
projective, 571 
algebraically independent, 409 
aligned primitive forms, 25 
archimedean, 331, 333, 346 
absolute value, 289 
place, 383 
valuation, 289 
Artin product formula, 387, 390, 395 
Artin reciprocity, 265 
Artin’s Theorem, 89 
Artinian ring, 87 
associated prime ideal, 446 
associated translation, 622 
associated vector subspace, 622 
associative algebra 
semisimple, 80 
simple, 80 
augmentation map, 149 


Baer, 168 
base field, 327 
base space, 640 
Bayer-Stillman ordering, 494 




Bezout, 449 
Bezout’s Theorem, 447, 453, 465, 471, 487, 
488 
bidegree, 617 
bifunctor, 223 
bihomogeneous polynomial, 617 
binary quadratic form, 3, 12 
similar, 74 
birational, 595 
map, 595 
birationally equivalent, 595 
Blichfeldt, 293 
boundary, 172 
map, 172 
operator, 172 
bounded sequence, 317 
bracket, 78 
Brauer equivalent, 124 
Brauer group, 126 
relative, 127 
Brauer’s Lemma, 91 
Buchberger, 450 
Buchberger’s algorithm, 506 


canonical class, 551 
canonical divisor, 551 
Cartan, E., 79 
Cartan, H., 168 
category 
abelian, 238 
additive, 233 
good, 169 
Cauchy sequence, 317 
Cayley, 77 
central algebra, 111 
central simple algebra, 111 
centralizer, 114 
chain complex, 171 
double, 257 
in abelian category, 240 
tensor product for, 258 
chain map, 154, 155, 173 
character 
Dirichlet, 62 
genus, 74 
multiplicative, 61 
principal Dirichlet, 62 
Chase, 141 




Chevalley, 165, 168 
Chinese Remainder Theorem, xxv, 30, 69, 
106, 314, 341, 367, 480, 483 
class field, Hilbert, 265 
class field theory, 265 
class group 
form, 28 
ideal, 42, 265, 299, 330, 393 
class number, 299, 393 
Dirichlet, 7, 14 
co-invariant, 209 
co-invariants functor, 209 
coboundary, 174 
map, 174 
operator, 174 
cochain complex, 173 
cochain map, 154, 174 
cocycle, 174 
codomain of morphism, 232 
cohomology, 153, 174 
sheaf, 168, 171, 218, 643 
coimage in abelian category, 240 
cokernel, 175 
cokernel of morphism, 236 
universal mapping property of, 236 
common discriminant divisor, 272 
common index divisor, 272, 287, 310, 371 
commutator ideal, 78 
complete presheaf, 641 
complete valued field, 343 
equal-characteristic case, 398 
unequal-characteristic case, 398 
completion, 342 
universal mapping property of, 343 
complex, 171 
chain, 171 
cochain, 173 
double, 257 
flat, 259 
in abelian category, 240 
place, 383 
composition formula, 24 
condition (C1), 165, 518 
cone, 572, 633 
conic, 458 
conjugate, 266, 288, 383 
connecting homomorphism, 185, 187 


connecting morphism in abelian category, 248 


convergent infinite product, 51 
convergent sequence, 317 
coordinate, 455, 559 

affine local, 461 
coordinate hyperplane, 620 
coordinate ring 

affine, 579 

homogeneous, 584 
coordinate subspace, 619 
coproduct, xxv 
correspondence, one-one, xxiii 
countable, xxiii 
Cramer, 448 
Cramer’s paradox, 449 
Cramer’s rule, 448 
crossed-product algebra, 137 
cubic, 458 

extension, pure, 280 

number field, 279, 302 

twisted, 562 
cubical singular chain, 172 
cubical singular homology, 172 
cup product, 256 
curve, affine plane, 455 
curve, elliptic, 648 
curve, irreducible, 604 

affine, 529 

affine plane, 430, 524, 562 
curve, projective plane, 458 
cycle, 172 
cyclic algebra, 122, 162, 163 
cyclotomic field, 309 


decomposition group, 368 
Dedekind, 77 
Dedekind Discriminant Theorem, 275, 371, 379, 381 
Dedekind domain, xxvi, 266 
extension of, xxvi, 327, 417 
Dedekind example, 287, 302, 310 
Dedekind’s Theorem on Differents, 376 
defined at a point, 580, 585 
degenerate, 172 
degree, 153 
of divisor, 533 
of inseparability, 415 
residue class, 275, 354, 533 
total, 457 
transcendence, 413 
derived functor, 204 
formation of, 205 
long exact sequence for, 211, 214 
Dickson, 122 
different, 279 
relative, 279, 372 
differential, 543, 547 
differential form, 541 
dimension 
geometric, 565 
Krull, 403, 424, 426, 528, 529, 
564, 566, 605, 619, 630, 639 
of affine algebraic set, 566 
of affine variety, 563 
of zero locus, 423 
Diophantus, 1 
direct product, restricted, 388 
direct sum in additive category, 233 
directed set, 438 
Dirichlet, 2, 24, 77 
Dirichlet box principle, 297 
Dirichlet character modulo m, 62 
Dirichlet class number, 7, 14 
Dirichlet L function, 63 
Dirichlet pigeonhole principle, 297 
Dirichlet series, 56 
Dirichlet Unit Theorem, 290, 292, 384, 390, 395 
Dirichlet’s Theorem, 7, 50 
discrete, 290 
discrete absolute value, 338 
discrete valuation, 322 
defined over k, 529 
discriminant, 12 
absolute, 35, 267 
field, 35, 264, 267 
fundamental, 33 
of commutative semisimple algebra, 382 
of ordered basis, 267 
relative, 275, 381 
discriminant divisor, 272 
divisible abelian group, 196 
divisible module, 251 
division algorithm, generalized, 499 
divisor, 532 
divisor class, 532, 549 
divisor, principal, 532 
domain of morphism, 232 


dominant rational map, 595 
Double Centralizer Theorem, 115 
double chain complex, 257 
dual of fractional ideal, 372 


Eckmann, 168 
Eilenberg, 168 
Eisenstein, 12 
Eisenstein polynomial, 402 
elimination ideal, 512 
Elimination Theorem, 512 
elimination type ordering, 494, 512 
elliptic curve, 648 
enough injectives, 202 
enough projectives, 202 
epi, 233 
epimorphism, 233 
equal-characteristic case, 398 
equivalence class of forms 
ordinary, 13 
proper, 13 
equivalence of 
absolute values, 333 
completions, 383 
forms, 13, 32 
forms, improper, 13 
forms, proper, 13, 32 
ideals, 40, 298 
ideals, narrow, 40 
ideals, strict, 40, 298 
morphisms, 242 
Euler, 1, 3, 9, 50 
Euler product, 50, 54, 60 
first-degree, 60 
Euler’s Theorem, 516, 646 
exact complex, 175 
exact functor, 179 
left, 182 
right, 183 
exact on injectives, 222 
exact on projectives, 222 
exact sequence, 175 
in abelian category, 240 
long, 187, 188 
short, 175 
split, 200 
Exchange Lemma, 412 
Ext functor, 223 




extension 
normal, 435 
of Dedekind domain, xxvi, 327, 417 
of integrally closed domain, 610 
of valued field, 358 
purely transcendental, 409 
Extension Theorem, 512 


factor set, 133 
trivial, 135 
Fermat, 1, 3, 9 
field discriminant, 35, 264, 267 
field of formal Laurent series, 347 
field of fractions, xxv 
field polynomial, 266 
fine sheaf, 218 
finiteness of class number, 390 
first-degree Euler product, 60 
flabby sheaf, 218 
flat complex, 259 
flat module, 257 
form 
binary quadratic, 3, 12 
class group, 28 
negative definite, 14 
positive definite, 14 
primitive aligned, 25 
reduced primitive, 18, 21 
Fourier inversion formula for finite abelian 
groups, 61 
fractional ideal, 321 
principal, 321 
relative dual of, 372 
free resolution, 152, 195 
Freudenthal, 168 
Frobenius element, 437 
Frobenius’s Theorem about division algebras 
over the reals, 118, 160 
function field, 419, 528, 580, 582, 585, 587 
in one variable, 326, 382, 528, 529 
in r variables, 419 
functor 
additive, 170, 178 
co-invariants, 209 
derived, 204 
exact, 179 
Ext, 223 
global-sections, 218 


homology-of-groups, 209 

invariants, 208 

left exact, 182 

right exact, 183 

Tor, 224 
functorial, 177 
functoriality of long exact sequence of derived functors, 215, 218 

functoriality with long exact sequence, 191 
functoriality with snake diagram, 190 
fundamental discriminant, 33 
fundamental parallelotope, 293 
Fundamental Theorem of Galois Theory, 443 
fundamental unit, 36, 288 


Galois, 77 
Galois group, 434 
gap sequence, 557 
Gauss, 1, 3, 9, 24, 77 
Gauss’s group, 5, 28 
Gelfand, 348 
generalized division algorithm, 499 
generalized quaternion algebra, 121 
generalized resultant, 514 
genus, 32, 539, 556, 557 
principal, 33 
genus character, 74 
genus group, 33, 70, 73 
geometric dimension, 565 
germ, 584 
global field, 382 
global-sections functor, 218 
good category, 169 
graded lexicographic ordering, 493 
graded monomial ordering, 627 
graded reverse lexicographic ordering, 494 
Gröbner, 450 
Gröbner basis, 450, 497, 564 
minimal, 508 
reduced, 509 
Grothendieck, 638 


Haar measure, 385 

Halphen, 450 

Hamilton, 77 

Hensel, 279 

Hensel’s Lemma, 349, 351, 353, 399 
Herstein, 130 




Hilbert, 404 
Hilbert Basis Theorem, xxvi, 491, 560 
Hilbert class field, 265 
Hilbert function, 633 

affine, 621, 626 
Hilbert polynomial, 633 

affine, 625, 628 
Hilbert’s Theorem 90, 71, 145 
homogeneous coordinate ring, 584 
homogeneous ideal, 458, 570 
homogeneous member of homogeneous coordinate ring, 585 

homogeneous Nullstellensatz, 572, 586, 635 
homogeneous polynomial, 457 
homology, 153, 172 

cubical singular, 172 

simplicial, 172 
homology-of-groups functor, 209 
homomorphism, 78 

connecting, 185, 187 

inflation, 254 

of valued field, 342 

restriction, 254 
homotopic, 154, 173, 174, 193, 198 
homotopy, 173, 174, 193, 198 
Hopf, H., 167 
Hopkins, 92 
Hurewicz, 167 
hyperplane, coordinate, 620 
hypersurface, irreducible affine, 430, 562 
hypersurface, irreducible projective, 573 


ideal 
fractional, 321 
in Lie algebra, 78 
principal fractional, 321 
valuation, 322 
ideal class group, 42, 265, 299, 330 
idele, 390 
idele class group, 393 
idempotent, 91, 369 
idempotent, primitive, 369 
image in abelian category, 240 
Implicit Function Theorem, 428, 600 
improper equivalence of forms, 13 
independent, algebraically, 409 
index, 272 
ramification, 275, 354 




inertia group, 370 
inertia subfield, 368 
inflation homomorphism, 254 
inflation-restriction sequence, 254 
injective, 195 

in abelian category, 241 
injective module, 195 
injective resolution, 199, 205 
inseparable element, 414 
integral closure, xxvi, 610 
integral domain, xxv 
integral element, xxvi 
integrally closed, xxvi 
intersection multiplicity, 467, 474 
intersection number, 467 
invariant, 208 
invariants functor, 208 
inverse limit, 439 

standard, 439 
inverse system, 438 
irreducible 

affine algebraic set, 563 

affine curve, 529 

affine hypersurface, 430, 562 

affine plane curve, 430, 524, 562 

closed set, 564, 573 

curve, 604 

element, xxv 

ideal, 446 

projective hypersurface, 573 
irredundant, 446 
isomorphic idempotents, 97 
isomorphism, 78 

of valued field, 342 

of varieties, 591 


Jacobi, 521 

Jacobi identity, 77 
Jacobi symbol, 68 
Jacobson radical, 89 


kernel of morphism, 235 
universal mapping property of, 235 
Koszul, 168 
Kronecker, 77 
Krull dimension, 403, 424, 426, 528, 529, 
564, 566, 605, 619, 630, 639 
Kummer, 77 




Kummer’s criterion, 275 
Künneth Theorem, 258-259 


Lagrange, 1, 4 
Langlands reciprocity, 265 
largest domain, 583, 595 


Lasker—Noether Decomposition Theorem, 446, 639 
lattice, 290 
Law of Quadratic Reciprocity, 3, 8 
least common multiple, 501 
left adjoint, 252 
left Artinian ring, 87 
left exact functor, 182 
left Noetherian ring, 87 
left semisimple ring, 81 
Legendre, 1, 4 
Legendre symbol, 8 
Leibniz, 7 
Leray, 168 
Levi, E. E., 79 
lexicographic ordering, 493 
Lie algebra, 77 
abelian, 78 
semisimple, 79 
simple, 79 
solvable, 78 
Lie subalgebra, 78 
line 
affine plane, 455 
at infinity, 459 
projective, 458 
Liouville, 521 
local expression, 462 
local field, 383 
local morphism, 642 
local ring, xxvi 
at a point, 580, 582, 585, 587 
local/global approach, 371 
localization, xxvi 
locus of common zeros, 429, 559, 571 
long exact sequence, 187, 188 
functoriality with, 191 
of derived functors, 211, 214 
functoriality of, 215, 218 


Mac Lane, 168, 420 
Macaulay, 627 


maps of a good category, 169 
matrix units, 101 
member in abelian category, 242 
minimal Gröbner basis, 508 
Minkowski, 301, 302 
Minkowski Lattice-Point Theorem, 293, 384 
modules of a good category, 169 
monic, 232 
mono, 232 
monomial, 457 
reduced, 646 
monomial ideal, 619 
monomial ordering, 493 
graded, 627 
monomorphism, 232 
morphism, 169 
local, 642 
of affine scheme, 642 
of ringed space, 642 
of varieties, 591 
multiplicative, 60 
multiplicative character, 61 
strictly, 60 
multiplicity of a tangent line, 478 


Nakayama’s Lemma, xxv, 120, 605, 606 
narrow equivalence of ideals, 40 
natural, 177 
negative, xxiii 
negative definite form, 14 
negatively oriented, 40 
neighbor, 21 

on the left, 21 

on the right, 21 
nil left ideal, 89 
nilpotent element, 89 
nilpotent left ideal, 80, 90 
Noether Normalization Lemma, 612 
Noether—Jacobson Theorem, 130 
Noetherian, xxvi 
Noetherian ring, 87 
Noetherian topological space, 564 
nonarchimedean, 331, 335, 338 
nonarchimedean place, 383 
nonsingular curve, 604 
nonsingular point, 429, 600, 601 
nontrivial absolute value, 332 
norm, 165, 356 




norm of ideal, 39 
absolute, 273 
normal extension, 435 
normalized absolute value, 383, 384, 385, 386 
Nullstellensatz, 403, 404, 428, 455, 480, 487, 
510, 516, 518, 524, 526, 529, 559, 561, 
563, 572, 579, 580, 581 
homogeneous, 572, 586, 635 
number field, xxvi 
cubic, 279, 302 
cyclotomic, 309 
quadratic, 35, 69, 263, 269 


Oka, 168 
one-one correspondence, xxiii 
order, 532 
order of vanishing, 474 
ordering 
Bayer-Stillman type, 494 
from tuple of weight vectors, 494 
graded lexicographic, 493 
graded monomial, 627 
graded reverse lexicographic, 494 
k-elimination type, 494, 512 
lexicographic, 493 
monomial, 493 
total, 493 
ordinary equivalence class of forms, 13 
oriented, 40 
orthogonal idempotents, 99, 369 
Ostrowski, 348 
Ostrowski’s Theorem, 336 


p-adic absolute value, 316 
p-adic integer, 279, 318 
p-adic integer, 346 
p-adic metric, 316 
p-adic number, 279, 316, 318 
p-adic number, 346 
parallelotope, 548 
fundamental, 293 
Peirce decomposition, 95 
perfect field, 418, 554 
place, 383 
plane, projective, 456 
plane curve 
affine, 455 
irreducible affine, 430, 524, 562 




projective, 458 
plane line, affine, 455 
Plücker, 450 
point, 455, 456, 459, 559 
points at infinity, 459 
pole part, 537 
pole set, 581, 585 
positive, xxiii 
positive definite form, 14 
positively oriented, 40 
presheaf, 640 
complete, 641 
primary ideal, 445 
prime element, xxv 
prime ideal, xxv 
associated, 446 
primitive, 12 
primitive form 
aligned, 25 
reduced, 18, 21 
primitive idempotent, 369 
primitively represent, 14 
principal Dirichlet character, 62 
principal divisor, 532 
principal fractional ideal, 321 
principal genus, 33 
problem 
ideal-equality, 510 
ideal-membership, 507 
proper-ideal, 507 
product, xxv 
profinite group, 441 
projective, 192 
algebraic set, 571 
closure, 575 
hypersurface, irreducible, 573 
in abelian category, 241 
limit, 439 
line, 458 
module, 192 
n-space, 457 
plane, 456 
plane curve, 458 
resolution, 195, 205 
transformation, 460 
variety, 572 
proper equivalence class of 
forms, 13 




forms over Q, 32 
forms over Z, 13 
pullback, 242 
pure cubic extension, 280 
type of, 281 
purely inseparable element, 415 
purely inseparable extension, 416 


purely transcendental extension, 409 


pushout, 202, 243 


quadratic form, binary, 3, 12 
quadratic form, similar, 74 


quadratic number field, 35, 69, 263, 269 


quadratic reciprocity, 3, 8, 68 
quartic, 458 

quasi-affine variety, 568 
quasiprojective variety, 573 
quaternion algebra, 121 


radical 
associative algebra, 80 
ideal, 405 
Jacobson, 89 
of Lie algebra, 78 
Wedderburn—Artin, 89, 91 
ramification index, 275, 354 
ramified, 367 
ramify, 264, 275, 308 
rational function, 580, 585 
rational map, 595 
dominant, 595 
rational point, 455, 456, 457, 459 
real place, 383 
reciprocity 
Artin, 265 
Langlands, 265 
quadratic, 3, 8, 68 
reduced Gröbner basis, 509 
reduced monomial, 646 
reduced norm, 165 
reduced polynomial, 165 
reduced primitive form, 18, 21 
reduced trace, 165 
reducible ideal, 446 
regular at a point, 580, 582, 585 
regular function at a point, 587 


regular function on an open set, 580, 582, 587, 641 


regular point, 429 
relative Brauer group, 127 
relative different, 279, 372 
relative discriminant, 275, 381 
relative dual of fractional ideal, 372 
represent, 14 
primitively, 14 
residue class degree, 275, 354, 533 
residue class field, 322 
Residue Theorem, 543 
resolution, 194 
acyclic, 219 
free, 152, 195 
injective, 199, 205 
projective, 195, 205 
standard, 149 
restricted direct product, 388 
restriction homomorphism, 254 
resultant, 449, 451 
generalized, 514 
Riemann, 521 
Riemann hypothesis, 530 
Riemann sphere, 328 
Riemann surface, 522 
Riemann zeta function, 52, 58 
Riemann-Roch Theorem, 520, 522, 523, 530, 
540, 543, 551, 552, 648 
Riemann’s inequality, 538 
right adjoint, 252 
right Artinian ring, 87 
right exact functor, 183 
right Noetherian ring, 87 
right semisimple ring, 81 
ring of formal power series, 347 
ringed space, 642 


S-polynomial, 502 
scheme, 642 

affine, 642 

defined over a ring, 643 
Schmidt, 422 
Schreier, 168 
Schur’s Lemma, 83 
section, 641 
Segre embedding, 617, 646 
Segre variety, 617 
semisimple 

associative algebra, 80 




Lie algebra, 79 
module, xxiv 
ring, 81, 84 
separable 
algebraic closure, 434 
element, 414 
extension, 415 
polynomial, 414 
semisimple algebra over a field, 109 
separably generated extension, 419 
separating transcendence basis, 419 
sheaf, 168, 640 
fine, 218 
flabby, 218 
cohomology, 168, 171, 218, 643 
structure, 641 
short exact sequence, 175 
in abelian category, 241 
similar binary quadratic form, 74 
simple 
associative algebra, 80 
Lie algebra, 79 
module, xxiv, 80 
ring, 85 
simplicial homology, 172 
singular cube, 172 
singular homology, 172 
singular point, 429, 600, 601 
Skolem—Noether Theorem, 113 
snake diagram, 185, 261 
functoriality with, 190 
Snake Lemma, 185, 248 
solution of problem 
ideal-equality, 510 
ideal-membership, 507 
proper-ideal, 507 
solvable Lie algebra, 78 
spectral sequence, 171 
spectrum, 639 
split algebra, 127 
split exact sequence, 200 
splitting field, 127 
stalk, 640 
standard inverse limit, 439 
standard resolution, 149 
standard subset, 622 
Stickelberger’s condition, 309 
Stone, 638 




strict equivalence of ideals, 40, 298 
strictly multiplicative, 60 
strong approximation property, 374 


Strong Approximation Theorem, 373, 390, 391 


structure sheaf, 641 
subalgebra, Lie, 78 
summation by parts, 56 


tangent lines, 478 
tensor product of 
algebras, 104 
chain complexes, 258 
fields, 104 
Theorem 90, Hilbert’s, 71, 145 
Tor functor, 224 
Tornheim, 349 
torsion abelian group, 169 
torsion submodule, 257 
total degree, 457 
total ordering, 493 
totally ramified, 367 
trace, 165 
transcendence basis, 409, 424 
existence, 411 
separating, 419 
transcendence degree, 413 
transcendence set, 409 
translate of form, 26 
triangular ring, 88 
trivial absolute value, 331 
trivial factor set, 135 
twisted cubic, 562 
type of pure cubic extension, 281 


ultrametric inequality, 316, 331 
unequal-characteristic case, 398 
uniformizer, 323 
uniformizing element, 323 
unit, xxiv, 36, 288 
fundamental, 36, 288 
unital, xxiv 
Universal Coefficient Theorem, 261 
universal mapping property of 
cokernel, 236 
completion of valued field, 343 
kernel, 235 




unramified, 367 


valuation, 322, 331 
archimedean, 289 
discrete, 322, 529 

valuation ideal, 322 

valuation ring, 322 

valued field, 342 
complete, 343 
extension of, 358 
homomorphism of, 342 
isomorphism of, 342 

variety, 590 
affine, 429, 562 
as a scheme, 643 
projective, 572 
quasi-affine, 568 
quasiprojective, 573 
Segre, 617 


Weak Approximation Theorem, 340, 374 

Wedderburn, 79, 86, 164 

Wedderburn—Artin radical, 89, 91 

Wedderburn’s Main Theorem, 94 

Wedderburn’s Theorem about finite division 
rings, 117, 160 

Wedderburn’s Theorem about semisimple 
rings, 83 

Weierstrass, 521 

Weierstrass gap, 557 

Weierstrass point, 557 

Weierstrass valuation, 557 

weight vectors, 494 

Weil, 530, 541, 543 

Weyl algebra, 85 


Zariski closure, 561, 578 
Zariski topology, 560, 571 
Zariski’s Theorem, 403, 431, 525, 558, 
600, 601, 605, 606 
zero locus, 455, 559, 571 
zero member, 245 
zero morphism, 233 
zero object, 233 
zeta function, 530 
Riemann, 52, 58 


